Equidistribution of joinings under off-diagonal 
polynomial flows of nilpotent Lie groups 



Tim Austin 



Abstract 

Let $G$ be a connected nilpotent Lie group. Given probability-preserving
$G$-actions $(X_i, \Sigma_i, \mu_i, u_i)$, $i = 0, 1, \ldots, k$, and also polynomial maps
$\varphi_i : \mathbb{R} \to G$, $i = 1, \ldots, k$, we consider the trajectory of a joining $\lambda$ of the
systems $(X_i, \Sigma_i, \mu_i, u_i)$ under the 'off-diagonal' flow

\[ (t, (x_0, x_1, x_2, \ldots, x_k)) \mapsto (x_0, u_1^{\varphi_1(t)} x_1, u_2^{\varphi_2(t)} x_2, \ldots, u_k^{\varphi_k(t)} x_k). \]

It is proved that any joining $\lambda$ is equidistributed under this flow with respect
to some limit joining $\lambda'$. This is deduced from the stronger fact of norm
convergence for a system of multiple ergodic averages, related to those arising in
Furstenberg's approach to the study of multiple recurrence. It is also shown
that the limit joining $\lambda'$ is invariant under the subgroup of $G^{k+1}$ generated
by the image of the off-diagonal flow, in addition to the diagonal subgroup.
1 Introduction

This paper is set among jointly measurable probability-preserving actions of a
connected nilpotent Lie group $G$. We will assume in addition that $G$ is simply
connected; it will be clear from the statements of our main results that by ascending to
the universal cover this incurs no real loss of generality.

Suppose that $u_i : G \curvearrowright (X_i, \Sigma_i, \mu_i)$ for $i = 0, 1, \ldots, k$ is a tuple of such actions
and that $\lambda$ is a joining of them. This means that $\lambda$ is a coupling of the measures $\mu_i$
on the product space $\prod_i X_i$, and that it is invariant under the diagonal
transformation

\[ u_\Delta^g := u_0^g \times u_1^g \times \cdots \times u_k^g \]

for every $g \in G$.



Taking the $G$-actions on each coordinate separately, the $u_i$ together define a jointly
measurable action $u_\times$ of the whole Cartesian power $G^{k+1}$ on $\prod_i X_i$ according to

\[ u_\times^{(g_0, g_1, \ldots, g_k)} := u_0^{g_0} \times u_1^{g_1} \times \cdots \times u_k^{g_k}. \]

In these terms $u_\Delta$ may be identified with the restriction of $u_\times$ to the diagonal
subgroup

\[ G^{\Delta(k+1)} := \{(g, g, \ldots, g) : g \in G\} \le G^{k+1}. \]



An arbitrary joining $\lambda$ need not be $u_\times$-invariant. However, the main result of this
paper implies that for any one-parameter subgroup $\mathbb{R} \to G^{k+1}$, the trajectory of
$\lambda$ under the $u_\times$-action of that subgroup must equidistribute with respect to some
new joining $\lambda'$ that is also invariant under that subgroup. Moreover, this statement
generalizes to averages over the trajectory of any map $\mathbb{R} \to G^{k+1}$ that is
'polynomial' in the sense that repeated group-valued differencing leads to the trivial map
(precise definitions are recalled in Section 4). The full result is the following.

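Theorem 1.1 If $(X_i, \Sigma_i, \mu_i, u_i)$, $0 \le i \le k$, and $\lambda$ are as above, and $\varphi_i : \mathbb{R} \to G$
for $1 \le i \le k$ are polynomial maps satisfying $\varphi_i(0) = e$ (the identity of $G$), then
the averaged measures

\[ \lambda_T := \fint_0^T \big(\mathrm{id}_{X_0} \times u_1^{\varphi_1(t)} \times \cdots \times u_k^{\varphi_k(t)}\big)_\ast \lambda \, dt \]

converge in the coupling topology as $T \to \infty$ to some joining $\lambda'$ of the systems
$(X_i, \Sigma_i, \mu_i, u_i)$ which is invariant under the restriction of $u_\times$ to the subgroup

\[ \big\langle G^{\Delta(k+1)} \cup \{(e, \varphi_1(t), \ldots, \varphi_k(t)) : t \in \mathbb{R}\} \big\rangle. \]

Here we have used the standard analyst's notation $\fint_0^T := \frac{1}{T}\int_0^T$, and we write $\langle S \rangle$
for the smallest closed subgroup of $G$ containing $S$.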

Remark If $t$ is such that $\varphi_i(t) \neq e$ then the individual measures

\[ \big(\mathrm{id}_{X_0} \times u_1^{\varphi_1(t)} \times \cdots \times u_k^{\varphi_k(t)}\big)_\ast \lambda \]

may not be joinings of the original actions. As measures they are still couplings of
the $\mu_i$, but the invariance of $\lambda$ under the diagonal subgroup has been replaced with
invariance under its conjugate

\[ (e, \varphi_1(t), \ldots, \varphi_k(t)) \cdot G^{\Delta(k+1)} \cdot (e, \varphi_1(t), \ldots, \varphi_k(t))^{-1}. \]

Thus a non-trivial part of the conclusion of Theorem 1.1 is that the smoothing effect
of averaging over $t$ recovers the invariance under $G^{\Delta(k+1)}$ (and likewise under all
of these conjugates). $\lhd$

Convergence $\lambda_T \to \lambda'$ in the coupling topology, as in Theorem 1.1, asserts that

\[ \int f_0 \otimes f_1 \otimes \cdots \otimes f_k \, d\lambda_T \to \int f_0 \otimes f_1 \otimes \cdots \otimes f_k \, d\lambda' \]

for any choice of $f_0 \in L^\infty(\mu_0)$, $f_1 \in L^\infty(\mu_1)$, \ldots, $f_k \in L^\infty(\mu_k)$. Informally, it
is a variant of weak convergence defined against the class of test functions given
by tensor products of bounded measurable functions on the individual coordinate
spaces. It is standard that this topology on the convex set of couplings is compact:
see, for instance, Theorem 6.2 of Glasner [18].
However, we will actually deduce Theorem 1.1 from a stronger kind of
convergence. For any joining $\lambda$ and any fixed choice of $f_i \in L^\infty(\mu_i)$ for $1 \le i \le k$, the
map

\[ f_0 \mapsto \int_{X_0 \times X_1 \times \cdots \times X_k} f_0 \otimes f_1 \otimes \cdots \otimes f_k \, d\lambda \]

defines a bounded linear functional on $L^2(\mu_0)$, and hence by the self-duality of
Hilbert space it specifies a function

\[ M^\lambda(f_1, \ldots, f_k) \in L^2(\mu_0) \]

(an alternative, more concrete description of $M^\lambda$ can be found in Section 2
below). The joining convergence asserted by Theorem 1.1 is equivalent to the weak
convergence in $L^2(\mu_0)$ of the averages

\[ A_T^\lambda(f_1, \ldots, f_k) := \fint_0^T M^\lambda\big(f_1 \circ u_1^{\varphi_1(t)}, \ldots, f_k \circ u_k^{\varphi_k(t)}\big) \, dt, \]

but in fact the methods we call on below (particularly the van der Corput estimate,
Lemma A.1) naturally give more:

Theorem 1.2 In the setting of Theorem 1.1, the averages $A_T^\lambda(f_1, \ldots, f_k)$ converge
in norm in $L^2(\mu_0)$ as $T \to \infty$ for any functions $f_i \in L^\infty(\mu_i)$, $1 \le i \le k$.

Of course, this does not immediately imply the remainder of Theorem 1.1
concerning the extra symmetries of the limit joining. That will require some additional
argument.

The problem of pointwise convergence of the averages remains open, and the
methods of the present paper probably say very little about it. One related special
case (for certain discrete-time averages) has been established by Bourgain in [12],
but I know of no more recent extensions of his work.

Origin and relation to other works 

Theorem 1.1 has its origin in the study of multiple recurrence. Furstenberg's
original Multiple Recurrence Theorem [15] asserts that for a single
probability-preserving transformation $T \curvearrowright (X, \Sigma, \mu)$, if $A \in \Sigma$ has $\mu(A) > 0$ then also

\[ \liminf_{N \to \infty} \frac{1}{N} \sum_{n=1}^{N} \mu\big(A \cap T^{-n}A \cap \cdots \cap T^{-(k-1)n}A\big) > 0 \quad \forall k \ge 1. \qquad (1) \]

In particular, there must be a time $n \ge 1$ at which

\[ \mu\big(A \cap T^{-n}A \cap \cdots \cap T^{-(k-1)n}A\big) > 0: \]

this is '$k$-fold multiple recurrence' for $A$.

Furstenberg studied this phenomenon in order to give a new proof of a deep
theorem of Szemerédi in additive combinatorics [39], which can be deduced quite easily
from the Multiple Recurrence Theorem. Following Furstenberg's original paper,
many other works have either proved analogous multiple recurrence assertions in
more general settings or analysed the 'multiple' ergodic averages of the kind
appearing in (1), in particular to determine whether they converge. We will not
attempt to give complete references here, but refer the reader to HI, to the paper [21]
of Host and Kra and to Chapters 10 and 11 of Tao and Vu's book [41] for more
details.

Many of these convergence questions can be phrased in terms of convergence of
joinings, much in the spirit of Theorem 1.1. In Furstenberg's original setting, if we
let $\mu^\Delta$ be the copy of $\mu$ supported on the diagonal in $X^k$, then the above averages
may be re-written as

\[ \int 1_A \otimes 1_A \otimes \cdots \otimes 1_A \, d\mu_N, \]

where

\[ \mu_N := \frac{1}{N} \sum_{n=1}^{N} \big(\mathrm{id}_X \times T \times \cdots \times T^{k-1}\big)^n_\ast \mu^\Delta, \]

so in fact the convergence of these scalar averages is almost precisely the assertion
that the orbit of the joining $\mu^\Delta$ under the off-diagonal transformation $\mathrm{id}_X \times T \times \cdots \times T^{k-1}$
is equidistributed relative to some limit joining. Convergence here follows from
work of Host and Kra [21] (see also Ziegler [44]), and it is worth noting that in this
situation the additional invariance of the limit joining under $\mathrm{id}_X \times T \times \cdots \times T^{k-1}$
is obvious from the definition of the $\mu_N$ and the Følner property of the intervals
$\{1, 2, \ldots, N\} \subset \mathbb{Z}$.

On the other hand, that additional invariance can be put at the heart of an
alternative proof of convergence, which also applies to the more general question of the
convergence of the averaged joinings

\[ \frac{1}{N} \sum_{n=1}^{N} \big(\mathrm{id}_X \times T_1 \times \cdots \times T_k\big)^n_\ast \mu^\Delta \]

for a commuting tuple of transformations $T_1, T_2, \ldots, T_k \curvearrowright (X, \Sigma, \mu)$: see lUKll
(and compare with Tao [40], where the first proof of convergence for this
higher-rank setting was given using very different methods). This more general setting
still exhibits a multiple recurrence phenomenon with striking combinatorial
consequences, as shown much earlier by Furstenberg and Katznelson [16]. Another
aspect of the study of the limit of the above joinings is that a sufficiently detailed
understanding of its structure can be used to give an alternative proof of their
theorem [6].

Having come this far, it is natural to ask after the behaviour of these averaged
joinings if $T_1, \ldots, T_k$ do not commute, but generate some more complicated
discrete group. In particular, if they generate a nilpotent group, then Leibman has
shown that multiple recurrence phenomena still occur [23] using an extension of
Furstenberg and Katznelson's arguments, but that approach does not prove that the
associated functional averages converge in $L^2(\mu)$. The question of convergence
seems to be closely related to whether the averages

\[ \frac{1}{N} \sum_{n=1}^{N} \big(\mathrm{id}_X \times T^{p_1(n)} \times \cdots \times T^{p_k(n)}\big)_\ast \mu^\Delta \]

converge for a $\mathbb{Z}^d$-action $T$ and polynomials $p_i : \mathbb{Z} \to \mathbb{Z}^d$, at least insofar as some
of the standard methods in this area (particularly the van der Corput estimate) run
into very similar difficulties in the contexts of these two problems.

These more general convergence questions were posed by Bergelson as Question 
9 in m, having previously been popularized by Furstenberg. Several special cases 
were established in |]T7ll9l|20l|25lll|l3l. On the other hand, the paper [10] contains
an example in which $k = 2$, $\langle T_1, T_2 \rangle$ is a two-step solvable group, and convergence
fails.

Shortly before the present paper was submitted, Miguel Walsh offered in [43] a
proof of convergence for general nilpotent groups and tuples of polynomial maps,
so answering the question of Furstenberg and Bergelson in full generality. His
proof is most akin to Tao's convergence proof in [40], but clearly involves some
non-trivial new ideas as well. It is quite different from the very 'structural'
approach taken by most ergodic theoretic papers, such as the present one. It seems
likely that Walsh's approach can be adapted to prove convergence in our setting
(Theorem 1.2), but it gives much less information on the structure of the resulting
factors and joinings (as, for example, in the rest of Theorem 1.3).

Our Theorem 1.2 establishes the analog of the conjecture of Furstenberg and
Bergelson (involving both nilpotent groups and polynomial maps) for continuous-time
flows. In Subsection 10.2 we will offer some discussion of the additional
difficulties presented by an adaptation of our approach to the discrete-time setting. It
would still be of interest to find a successful such adaptation, since it would
presumably require uncovering a more detailed description of the relevant factors and
joinings, and so would comprise a substantial complement to the approach via
Walsh's methods.

We should note also that the case $G = \mathbb{R}^d$ in Theorems 1.1 and 1.2 was recently
established in [2]. However, the methods below diverge quite sharply from that
previous paper. That work relied crucially on making a time change $t \mapsto t^\alpha$ in the
integral averages under study for some small $\alpha > 0$, in order to convert averages
along polynomial orbits into averages along orbits given by a linear map perturbed
by some terms that grow at sublinear rates in $t$. That trick leads to a substantial
simplification of the necessary induction on families of polynomials (in that
paper Bergelson's PET induction is not needed, since something more direct suffices,
whereas this induction scheme will appear in the present paper shortly), and so cuts
out various other parts of the argument that we use below. However, I do not know
how to implement this time-change trick for maps into general nilpotent groups,
essentially because various commutators that appear during the proof can give rise
to high-degree terms which disrupt the choice of any particular $\alpha$ used to make
the leading-order terms linear. It is also my feeling that the argument given
below reveals rather more about the relevant structures within probability-preserving
$G$-actions that are responsible for the asymptotic behaviour of the averages in
Theorem 1.1.

Although it emerges from the study of multiple recurrence, Theorem 1.1 fits neatly
into the general program of equidistribution. Equidistribution phenomena for
sequences in compact spaces, and especially sequences arising from dynamical
systems, have been popular subjects of analysis for most of the twentieth century: see,
for instance, the classic text [22]. Theorem 1.1 can be seen as a close analog of
more classical results concerning special classes of compact topological systems:
in place of the orbit of an individual point or distinguished subset, we study the
orbit of an initially-given joining, and correspondingly vague convergence of
measures (that is, tested against continuous functions on a compact space) is replaced
by convergence in the coupling topology.

Of course, equidistribution theorems for topological systems always rely very
crucially on the special structure of the system under study. Among arbitrary actions
on compact spaces there are plentiful examples for which the set of invariant
probabilities is very large and unstructured, and which have many points that do not
equidistribute. It is interesting that once a tuple of systems $(X_i, \Sigma_i, \mu_i, u_i)$ with
invariant probabilities has been fixed, their joinings exhibit the behaviour of
Theorem 1.1 without any extra assumptions on those individual systems. Instead, the
necessary provisions are that we start with the orbit of some joining, rather than of a
single point, and then prove equidistribution in the sense of the coupling topology.

Among the most profound results giving equidistribution for concrete systems are 
those concerning the orbits of unipotent flows on homogeneous spaces. In this 
setting the heart of such an analysis is typically a classification of all invariant 
probability measures on a system, which then restricts the possible vague limits 
one can obtain from the empirical measures along an orbit of the system so that, 
ideally, one can prove that the empirical measures have only one possible limit 
(and so are equidistributed). 

To some extent the approach to Theorem 1.1 parallels that strategy, in that the
additional invariances of the limit joinings are an important tool in the proof, and
our arguments do imply some further results on the possible structure of the limit
joinings (see the second remark following Proposition 8.2).

The full strength of measure classification for probabilities on homogeneous spaces
that are invariant and ergodic under the action of a subgroup generated by unipotent
elements was finally proved by Ratner in [35, 36], building on several important
earlier works of herself and others. The monograph [28] gives a thorough account
of this story. Following Ratner's work, Shah proved in [37] some equidistribution
results for trajectories of points in homogeneous spaces under flows given by
regular algebraic maps into the acting group. That notion of 'polynomial' encompasses
ours in many cases, and so his work offers a further point of contact between the
two settings.

However, the details of the arguments used below are rather far from those devel- 
oped by Ratner and her co-workers. For instance, in Shah's paper, he first shows 
that any vague limit measure for the trajectory of a point under one of his regular 
algebraic maps must have some invariance under a nontrivial unipotent subgroup. 
In light of this he can restrict his attention to the possible limit measures that are 
permitted by Ratner's Measure Classification Theorem, whereupon the extra anal- 
ysis needed can proceed. By contrast, it is essential in our work that we allow 
general polynomial maps into G throughout, since our induction would not remain 
among homomorphisms even if we started there. It would be interesting to know 
whether an alternative approach to Theorem 1.1 can be found which is more in line
with those works on homogeneous space dynamics. 

First outline of the proof 

Theorems 1.1 and 1.2 will be proved by induction on the tuple of polynomial maps
$(\varphi_1, \varphi_2, \ldots, \varphi_k)$. The ordering on polynomials that organizes this induction is (a
variant of) Bergelson's PET ordering from [7], which has become a mainstay of
the study of multiple averages involving nilpotent groups or polynomial maps.

To a large extent, the innovation below is the formulation of an assertion that
includes Theorem 1.1 and can be closed on itself in this induction. The delicacy of
this formulation is largely attributable to the van der Corput estimate (Lemma A.1),
which relates the averages involving a given tuple of polynomial maps to another
tuple that precedes it in the PET ordering. In the first place, it is this that forces
us to prove Theorem 1.2 alongside Theorem 1.1, but it will also require other
features in our inductive hypothesis.

An application of this lemma converts an assertion about a tuple of polynomial 
maps 

\[ t \mapsto \varphi_i(t) \]

into another about the 'differenced' maps

\[ (t, s) \mapsto \varphi_i(t + s)\varphi_i(t)^{-1} \qquad (2) \]

(or more complicated relatives of these: see Section H). Regarded as functions of 
$t$ alone, these precede the tuple $(\varphi_1, \ldots, \varphi_k)$ in the PET ordering for any fixed $s$.
In many applications of PET induction one simply forms these derived maps, then
fixes a value of $s$ and applies an inductive hypothesis to the restrictions of these
new maps to $\mathbb{R} \times \{s\}$. Unfortunately, in our setting there can be some values of
$s$ for which the behaviour of these restrictions is not as 'good' as our argument
needs. To overcome this we must retain the picture of the new maps in (2) as being
polynomial on the whole of $\mathbb{R} \times \mathbb{R}$. As a consequence of this polynomial structure
and certain general results about actions of nilpotent Lie groups (see Section 5),
one finds that these averages behave 'asymptotically the same' for all but a small
set of exceptional values of $s$. This turns out to be a crucial improvement over
the possible worst-case behaviour over $s$. Since repeated appeals to the van der
Corput estimate lead to a proliferation of these differencing parameters $s$, we must
actually formulate a theorem which allows for polynomial maps $\mathbb{R} \times \mathbb{R}^r \to G$,
where we average over the first coordinate in $\mathbb{R} \times \mathbb{R}^r$ and the theorem promises
some additional good behaviour for generic values of the remaining $r$ coordinates.
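
To convey the flavour of the differencing step (2), here is a toy computation that is not taken from the text: it assumes the simplest Abelian case $G = \mathbb{R}$, written additively, and the hypothetical choice $\varphi(t) = t^2$. It is only a sketch of how the degree in $t$ drops and how a Zariski-small set of parameters $s$ can behave differently from the rest.

```latex
% Toy example; G = (\mathbb{R}, +) and \varphi(t) = t^2 are assumptions made for illustration.
\[
  \varphi(t+s) - \varphi(t) \;=\; (t+s)^2 - t^2 \;=\; 2st + s^2 .
\]
% For each fixed s \neq 0 this has degree 1 in t (down from degree 2).
% For s = 0 it vanishes identically, so s = 0 is exceptional;
% the exceptional set \{ s = 0 \} is a proper subvariety of \mathbb{R}.
```

The exceptional behaviour treated in the argument concerns the averages themselves rather than just degrees, so this example should be read only as an analogy.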

The right notion of genericity to make this precise is provided by Baire's definition
of category, but transplanted into the Zariski topology of $\mathbb{R}^n$ (which is not
Hausdorff and so not quite in the usual mould for applications of Baire category). The
required notion of 'Zariski genericity' will be defined in Section 3 and will be
found to relate very well to other standard notions of 'smallness' for subsets of $\mathbb{R}^n$.

In terms of this definition, the complete statement that will be proved by PET
induction is as follows.



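Theorem 1.3 Suppose that $(X_i, \Sigma_i, \mu_i, u_i)$, $0 \le i \le k$, and $\lambda$ are as above and
that $\varphi_i : \mathbb{R} \times \mathbb{R}^r \to G$, $1 \le i \le k$, are polynomial maps satisfying $\varphi_i(0, \cdot) \equiv e$.
Let $M^\lambda$ be constructed from $\lambda$ as previously, let

\[ A_T^\lambda(f_1, \ldots, f_k) := \fint_0^T M^\lambda\big(f_1 \circ u_1^{\varphi_1(t,h)}, \ldots, f_k \circ u_k^{\varphi_k(t,h)}\big) \, dt \]

(so $A_T^\lambda$ implicitly depends on $h$), and let

\[ \vec{\varphi} := (e, \varphi_1, \ldots, \varphi_k) : \mathbb{R} \times \mathbb{R}^r \to G^{k+1}. \]

Then

1. for any $h$ and any $f_i \in L^\infty(\mu_i)$, $1 \le i \le k$, the functional averages
$A_T^\lambda(f_1, \ldots, f_k)$ converge in $L^2(\mu_0)$ as $T \to \infty$,

2. for any $h$ the averaged joinings

\[ \fint_0^T \big(\mathrm{id}_{X_0} \times u_1^{\varphi_1(t,h)} \times \cdots \times u_k^{\varphi_k(t,h)}\big)_\ast \lambda \, dt \]

converge as $T \to \infty$ to some limit joining $\lambda'_h$ which is invariant under

\[ \big\langle G^{\Delta(k+1)} \cup \operatorname{img} \vec{\varphi}(\cdot, h) \big\rangle, \]

and

3. the map $h \mapsto \lambda'_h$ is Zariski generically constant on $\mathbb{R}^r$, and the generic value
it takes is a joining invariant under

\[ \big\langle G^{\Delta(k+1)} \cup \operatorname{img} \vec{\varphi} \big\rangle. \]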

This clearly implies both of the previous theorems. The rest of the paper is directed
towards the proof of Theorem 1.3.

Overview of the paper



Sections 2 through 6 establish certain background results that we will need for the
main proofs, concerning general properties of group actions and representations;
polynomial maps and genericity in the Zariski topology; finer results about actions
of nilpotent Lie groups; and the technology of 'idempotent' classes of
probability-preserving systems. Once all this is at our disposal, the proof of Theorem 1.3
is completed in Sections 7, 8 and 9. Finally Section 10 contains a discussion of
various further questions related to those of this paper.

2 Background on group actions 

If $G$ is a locally compact second countable ('l.c.s.c.') group, then a $G$-system is
a quadruple $(X, \Sigma, \mu, u)$ in which $(X, \Sigma, \mu)$ is a standard Borel probability space
and $g \mapsto u^g$ is a jointly measurable, $\mu$-preserving left action of $G$ on $(X, \Sigma)$.
Sometimes this situation will alternatively be denoted by $u : G \curvearrowright (X, \Sigma, \mu)$, and
sometimes a whole system will be denoted by a boldface letter such as $\mathbf{X}$.

Relatedly, a $G$-representation is a strongly continuous orthogonal representation
$\pi$ of $G$ on a separable real Hilbert space $\mathfrak{H}$. (It would be more conventional to
work with complex Hilbert spaces and unitary representations, but choosing the
real setting avoids the need to keep track of several complex conjugations later.)
This situation will often be denoted by $\pi : G \curvearrowright \mathfrak{H}$. Given a $G$-system $(X, \Sigma, \mu, u)$,
the associated Koopman representation $u^\ast : G \curvearrowright L^2(\mu)$ is defined by

\[ u^\ast(g) f := f \circ u^{g^{-1}}, \]

where this convention concerning inverses ensures that both $u$ and $u^\ast$ are left
actions. Here and throughout the paper the notation $L^p$, $1 \le p \le \infty$, is used for
real Lebesgue spaces. It is classical that the joint measurability of $u$ implies the
strong continuity of $u^\ast$ (see, for instance, Lemma 5.28 of Varadarajan [42]), so the
Koopman representation is a $G$-representation in the present sense.

Given a $G$-system and a closed subgroup $H \le G$, one may construct the
$\sigma$-subalgebra

\[ \Sigma^H := \{A \in \Sigma : \mu(u^h A \,\triangle\, A) = 0 \ \ \forall h \in H\}. \]

If $H$ is normal in $G$ then this is globally $G$-invariant, and hence defines a factor of
the original system which we call the $H$-partially invariant factor. For some quite
special technical reasons we will need only the case of normal $H$ in this paper: see
Corollary 5.2 below.
Similarly, for a $G$-representation $\pi$ we let

\[ \mathrm{Fix}(\pi(H)) := \{v \in \mathfrak{H} : \pi(h)v = v \ \ \forall h \in H\} \le \mathfrak{H}; \]

for Koopman representations it is easily seen that

\[ \mathrm{Fix}(u^\ast(H)) = L^2(\mu|_{\Sigma^H}). \]

Sometimes it is necessary to compare actions of different groups. If $q : H \to G$
is a continuous homomorphism of l.c.s.c. groups and $\mathbf{X} = (X, \Sigma, \mu, u)$ is a
$G$-system, then we may define an $H$-system on the same probability space by letting
$h$ act by $u^{q(h)}$. We denote this system by $\mathbf{X}^{q(\cdot)} = (X, \Sigma, \mu, u^{q(\cdot)})$. A similar
construction is clearly possible for representations.

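We will also need certain standard calculations involving couplings and joinings.
Suppose that $\lambda$ is a coupling of $\mu_0, \mu_1, \ldots, \mu_k$ (without any assumption about
group actions). We may regard it instead as a coupling of $(X_0, \Sigma_0, \mu_0)$ with

\[ (X_1 \times \cdots \times X_k, \ \Sigma_1 \otimes \cdots \otimes \Sigma_k, \ \lambda') \]

where $\lambda'$ is the marginal of $\lambda$ on the last $k$ coordinates. Now $\lambda$ can be disintegrated
over the first coordinate to obtain a probability kernel

\[ \Lambda : X_0 \to \Pr(X_1 \times \cdots \times X_k, \ \Sigma_1 \otimes \cdots \otimes \Sigma_k) \]

so that

\[ \lambda = \int_{X_0} \delta_{x_0} \otimes \Lambda(x_0, \cdot) \, \mu_0(dx_0), \]

and this, in turn, defines a multilinear map

\[ M^\lambda : L^\infty(\mu_1) \times \cdots \times L^\infty(\mu_k) \to L^\infty(\mu_0) \]

according to

\[ M^\lambda(f_1, \ldots, f_k)(x_0) := \int f_1 \otimes f_2 \otimes \cdots \otimes f_k \, d\Lambda(x_0, \cdot). \]

Clearly one has

\[ \int_{X_0} f_0 \cdot M^\lambda(f_1, \ldots, f_k) \, d\mu_0 = \int_{X_0 \times X_1 \times \cdots \times X_k} f_0 \otimes f_1 \otimes \cdots \otimes f_k \, d\lambda, \]

so this agrees with the definition of $M^\lambda$ by duality given in the Introduction.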

The following is now a routine re-formulation of the definition of a relatively
independent product, and the proof is omitted; see, for instance, the third of Examples
6.3 in Glasner [18].
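Lemma 2.1 Let $\Lambda : X_0 \to \Pr(X_1 \times \cdots \times X_k)$ be as above and define the relative
product measure $\lambda \otimes_0 \lambda$ on $X_1^2 \times \cdots \times X_k^2$ by

\[ \lambda \otimes_0 \lambda = \int_{X_0} \Lambda(x_0, \cdot) \otimes \Lambda(x_0, \cdot) \, \mu_0(dx_0). \]

Then for any $f_i, g_i \in L^\infty(\mu_i)$, $1 \le i \le k$, one has

\[ \int_{X_0} M^\lambda(f_1, f_2, \ldots, f_k) \cdot M^\lambda(g_1, g_2, \ldots, g_k) \, d\mu_0
   = \int_{X_1^2 \times \cdots \times X_k^2} f_1 \otimes g_1 \otimes f_2 \otimes g_2 \otimes \cdots \otimes f_k \otimes g_k \, d(\lambda \otimes_0 \lambda). \]

□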



3 Real polynomials and Zariski residual sets 

The third part of Theorem 1.3 involves the notion of Zariski genericity. Recall that
on $\mathbb{R}^n$ (or any other real algebraic variety) the Zariski topology is the topology
whose closed sets are the subvarieties. Although the failure of $\mathbb{R}$ to be algebraically
closed gives rise to certain novel behaviour not seen in more classical algebraic
geometry (especially under projection maps), in this paper we will not meet any of
the situations in which this matters. The basic notions of the theory can be found in
many books that use algebraic groups, such as in Subsection D.1 of Starkov [38].
The additional idea we need from that arena is the following.

Definition 3.1 (Zariski meagre and residual sets) A subset $E \subseteq \mathbb{R}^n$ is Zariski
meagre if it can be covered by a countable family of proper subvarieties of $\mathbb{R}^n$. A
subset of $\mathbb{R}^n$ is Zariski residual if its complement is Zariski meagre. A property
that depends on a parameter $h \in \mathbb{R}^n$ is Zariski generic if it obtains on a Zariski
residual set of $h$.
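
As an illustration of Definition 3.1 (an example added here, not taken from the text, with $n = 2$ and coordinates $(x_1, x_2)$ chosen for convenience), consider the union of all vertical lines in the plane with rational abscissa:

```latex
% Illustrative example; the choice n = 2 and the set E below are assumptions.
\[
  E \;:=\; \bigcup_{q \in \mathbb{Q}} \{ (x_1, x_2) \in \mathbb{R}^2 : x_1 - q = 0 \}
    \;=\; \mathbb{Q} \times \mathbb{R}.
\]
% Each line \{ x_1 = q \} is a proper subvariety of \mathbb{R}^2 and \mathbb{Q} is countable,
% so E is Zariski meagre, even though E is dense in the Euclidean topology.
% Its complement (\mathbb{R} \setminus \mathbb{Q}) \times \mathbb{R} is therefore Zariski residual.
```

So a Zariski meagre set need not be closed or nowhere dense, but it is always a countable union of sets that are.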

Since proper subvarieties are always closed and nowhere dense in the Euclidean 
topology, Zariski residual sets are residual in the Euclidean topology. They are 
therefore 'large' in the sense of the Baire Category Theorem and its consequences, 
but in a much more structured way than an arbitrary Euclidean-residual subset. In 
particular, they exhibit the following simple behaviour under slicing: 
Lemma 3.2 If $E \subseteq \mathbb{R}^n$ is Zariski meagre and $V \subseteq \mathbb{R}^n$ is any affine subspace then
either $E \supseteq V$ or $E \cap V$ is Zariski meagre in $V$. In the space of translates $\mathbb{R}^n/V$,
the subset of translates for which the former holds is Zariski meagre.

Proof This is simply a consequence of the corresponding property of Zariski 
closed sets. □ 

Zariski meagre sets are also small in a natural measure-theoretic sense. 

Lemma 3.3 A Zariski meagre subset $E \subseteq \mathbb{R}^n$ has Hausdorff dimension at most
$n - 1$.

Proof Clearly it suffices to show that a single proper algebraic subvariety $V \subseteq \mathbb{R}^n$
has Hausdorff dimension at most $n - 1$, and moreover that this holds when
$V = \{f = 0\}$ for some nonzero polynomial $f : \mathbb{R}^n \to \mathbb{R}$ (because any proper $V$
can be contained in such a zero-set).

This follows by induction on degree. If $f$ is linear then it is immediate, so suppose
$\deg f \ge 2$. Then on the one hand the nonsingular locus $\{f = 0\} \cap \{\nabla f \neq 0\}$
can be covered with countably many open sets on which $\{f = 0\} \cap \{\nabla f \neq 0\}$
locally agrees with a smooth $(n-1)$-dimensional submanifold of $\mathbb{R}^n$, and hence has
Hausdorff dimension $n - 1$. On the other hand, the remaining set $\{f = 0\} \cap \{\nabla f = 0\}$
is contained in the set $\{\xi(\nabla f) = 0\}$ for any choice of $\xi \in (\mathbb{R}^n)^\ast \setminus \{0\}$, which
is an algebraic variety generated by a polynomial of degree at most $\deg f - 1$ and
so has Hausdorff dimension at most $n - 1$ by the inductive hypothesis. □



4 Polynomial maps into nilpotent Lie groups 

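Henceforth $G$ will denote a connected and simply connected nilpotent Lie group,
$\mathfrak{g}$ its Lie algebra,

\[ G = G^1 \ge G^2 \ge \cdots \ge G^s \ge (e) \]

its ascending central series, and

\[ \mathfrak{g} = \mathfrak{g}^1 \ge \mathfrak{g}^2 \ge \cdots \ge \mathfrak{g}^s \ge (0) \]

the corresponding ascending series of $\mathfrak{g}$.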

In the following we will need certain standard facts about such groups, in particular
that the exponential map $\exp : \mathfrak{g} \to G$ is an analytic diffeomorphism and that any
Lie subalgebra $\mathfrak{h} \le \mathfrak{g}$ exponentiates to a closed Lie subgroup of $G$, which is normal
if and only if $\mathfrak{h}$ was an ideal. (Note that both of these require the assumption that $G$
is simply connected as well as connected.) These can be found as Theorem 1.2.1
and Corollary 1.2.2 in Corwin and Greenleaf [14], which provides a good general
reference for the study of these groups.

4.1 Polynomial maps 

Definition 4.1 (Polynomial map) A map $\varphi : G' \to G$ between nilpotent Lie
groups is polynomial if there is some $d \ge 1$ such that

\[ \nabla_{h_1} \cdots \nabla_{h_d} \varphi \equiv e \quad \forall h_1, h_2, \ldots, h_d \in G', \]

where $\nabla_h \varphi(g) := \varphi(gh^{-1})\varphi(g)^{-1}$.

This definition has come to prominence in the study of multiple recurrence
phenomena since Leibman's work generalizing the Furstenberg-Katznelson Multiple
Recurrence Theorem to tuples of transformations generating a nilpotent group [23].
For maps into a module $M$ over a ring $R$ (such as an Abelian group, which is a
module over $\mathbb{Z}$), degree-$d$ polynomial maps have been studied much more
classically as an ideal of functions $G \to M$ annihilated under convolution by the $d^{\mathrm{th}}$
power of the augmentation ideal of $R[G]$: see, for instance, Passi [29, 30].

In this work we will need the above definition only for $G' = \mathbb{R}^n$. If in addition
$G = \mathbb{R}^m$, then it is a simple exercise to show that a map $\varphi$ is polynomial according
to the above if and only if it may be expressed as an $m$-tuple of polynomials in
$n$ variables. For general nilpotent targets $G$ a more concrete view of polynomial
maps is still available by the following standard proposition and corollary (for the
former see, for instance, Proposition 1.2.7 in Corwin and Greenleaf [14]).

Proposition 4.2 If $G$ is an $s$-step connected and simply connected nilpotent Lie
group, then $\exp : \mathfrak{g} \to G$ is a diffeomorphism, and pulled back through $\exp$ the
operations of multiplication and inversion become polynomial maps $\mathfrak{g} \times \mathfrak{g} \to \mathfrak{g}$
and $\mathfrak{g} \to \mathfrak{g}$ of degree bounded only in terms of $s$. □

Corollary 4.3 A map $\varphi : \mathbb{R}^n \to G$ is polynomial if and only if it is of the form
$\exp \circ \Phi$ for some polynomial $\Phi : \mathbb{R}^n \to \mathfrak{g}$.






Proof This follows by induction on the nilpotency class of $G$. On the one hand, if
$\Phi : \mathbb{R}^n \to \mathfrak{g}$ is a polynomial, then after $(\deg \Phi)$-many applications of the
differencing operator $\nabla$, the exponentiated map $\exp \circ \Phi$ may not vanish identically, but
at least its projection to $G/G^2$ vanishes because this is isomorphic to the
projection of $\Phi$ to $\mathfrak{g}/\mathfrak{g}^2$. Thus finitely many differencing operations yield a polynomial
map into $\mathfrak{g}^2$, and now repeating this argument $s$ times shows that the differences of
$\exp \circ \Phi$ do eventually vanish.

On the other hand, if $\varphi : \mathbb{R}^n \to G$ is a polynomial map, then the same is true
of $\varphi G^2 : \mathbb{R}^n \to G/G^2 \cong \mathbb{R}^{\dim G - \dim G^2}$. This map is simply isomorphic
to $(\exp^{-1} \circ \varphi) + \mathfrak{g}^2 : \mathbb{R}^n \to \mathfrak{g}/\mathfrak{g}^2$, so this latter is a polynomial. By choosing
lifts of its coefficients under the projection $\mathfrak{g} \to \mathfrak{g}/\mathfrak{g}^2$, we obtain a polynomial
$\Phi_1 : \mathbb{R}^n \to \mathfrak{g}$ such that $\exp \circ (\exp^{-1} \circ \varphi - \Phi_1)$ takes values in $G^2$, and it is clearly
still a polynomial map there using the argument of the previous paragraph. Now
the inductive hypothesis applied to $G^2$ gives another polynomial $\Phi_2 : \mathbb{R}^n \to \mathfrak{g}^2$
such that $\exp \circ (\exp^{-1} \circ \varphi - \Phi_1) = \exp \circ \Phi_2$, and re-arranging this completes the
proof. □

By pulling back to the Lie algebra and arguing there, the above proposition and
corollary have the following further consequence, which will be useful in the
sequel.

Corollary 4.4 If $\varphi, \psi : \mathbb{R}^n \to G$ are polynomial maps, then so are the pointwise
product $x \mapsto \varphi(x)\psi(x)$ and the pointwise inverse $x \mapsto \varphi(x)^{-1}$. □
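
The following worked example is an illustration added here rather than part of the original argument. It assumes a standard coordinate model of the three-dimensional Heisenberg group (a $2$-step group) and a hypothetical map $\varphi(t) = (t, t, 0)$, and it uses the differencing convention $\varphi(t+s)\varphi(t)^{-1}$. It sketches how differencing first kills the projection to $G/G^2$ and only afterwards the central coordinate.

```latex
% Assumed model (not the paper's notation): G = \mathbb{R}^3 with
%   (a,b,c)\cdot(a',b',c') = (a+a',\; b+b',\; c+c'+ab'),
% so that (a,b,c)^{-1} = (-a,\,-b,\,-c+ab) and G^2 = \{(0,0,c)\}.
% Hypothetical polynomial map: \varphi(t) = (t,t,0), which is not a homomorphism.
\begin{align*}
  \psi_s(t) &:= \varphi(t+s)\varphi(t)^{-1}
     = (t+s,\,t+s,\,0)\cdot(-t,\,-t,\,t^2)
     = (s,\; s,\; -st), \\
  \psi_s(t+r)\psi_s(t)^{-1}
    &= (s,\,s,\,-s(t+r))\cdot(-s,\,-s,\,st+s^2)
     = (0,\; 0,\; -sr), \\
  (0,0,-sr)\cdot(0,0,-sr)^{-1} &= (0,0,0) = e .
\end{align*}
% After one differencing the projection to G/G^2 \cong \mathbb{R}^2 is constant in t,
% after two it is trivial and the map takes values in G^2,
% and a third differencing gives the identity, so d = 3 suffices for this \varphi.
```

This matches the pattern in the proof of Corollary 4.3: the differences vanish first modulo $G^2$ and then inside $G^2$.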



4.2 Families of maps and the PET ordering 

Our attention now turns to finite tuples

\[ \mathcal{F} = (\varphi_1, \varphi_2, \ldots, \varphi_k) \]

of polynomial maps $\mathbb{R} \times \mathbb{R}^r \to G$.

In what follows it is extremely important that we consider the domain of these maps
to be split as $\mathbb{R} \times \mathbb{R}^r$. Although this is not really different from $\mathbb{R}^{r+1}$, the heart of
the main induction below rests on comparing the degrees of different polynomial
maps into $G$ in the first coordinate only. Therefore we will henceforth restrict
attention to maps defined on products of $\mathbb{R}$ with other real vector spaces, and will
always regard the second coordinate as an auxiliary parameter.

Definition 4.5 (Internal class; leading degree; leading term) For a polynomial
map $\varphi : \mathbb{R} \times \mathbb{R}^r \to G$ with $\varphi(0, \cdot) = e$, its internal class is the greatest $c$
such that $G^c \supseteq \mathrm{img}\, \varphi$. It is denoted $\mathrm{cl}\, \varphi$.

Given this, the projection

$$\varphi G^{c+1} : (t,h) \mapsto \varphi(t,h)G^{c+1} : \mathbb{R} \times \mathbb{R}^r \to G^c/G^{c+1} \cong \mathbb{R}^{\dim G^c - \dim G^{c+1}}$$

is a Euclidean-valued polynomial map. The leading degree $\mathrm{ldeg}\, \varphi$ of $\varphi$ is the
degree of $\varphi G^{c+1}$ in the variable $t$, and the leading term of $\varphi$ is the term in $\varphi G^{c+1}$
of the form $t^{\mathrm{ldeg}\, \varphi}\psi(h)$ for some polynomial map $\psi : \mathbb{R}^r \to G^c/G^{c+1}$.



Definition 4.6 (Leading-term equivalence) Two polynomial maps $\varphi, \psi : \mathbb{R} \times
\mathbb{R}^r \to G$ are leading-term equivalent, denoted $\varphi \sim_{\mathrm{LT}} \psi$, if $\mathrm{cl}\, \varphi = \mathrm{cl}\, \psi$ and
$\varphi$ and $\psi$ have the same leading term (hence certainly the same leading degree).

Several further definitions are needed in order to explain the PET ordering that will
steer the inductive proof of Theorem 1.3. The next roughly follows Leibman [23].



Definition 4.7 (Weight) The weight of a polynomial $\varphi : \mathbb{R} \times \mathbb{R}^r \to G$ is the pair
$\mathrm{wt}\, \varphi := (\mathrm{cl}\, \varphi, \mathrm{ldeg}\, \varphi)$. The set $\mathrm{Wt}$ of possible weights $(c,d)$ is ordered
lexicographically: pairs $(c,d), (c',d') \in \mathrm{Wt}$ satisfy $(c,d) \prec (c',d')$ if

• either $c > c'$,

• or $c = c'$ and $d < d'$.



Since clearly $\varphi \sim_{\mathrm{LT}} \psi$ implies $\mathrm{wt}\, \varphi = \mathrm{wt}\, \psi$, we may also define the weight of a
$\sim_{\mathrm{LT}}$-equivalence class as the weight of any of its members.

This is a well-ordering on $\mathrm{Wt}$, and it now gives rise to a partial ordering on
polynomial maps.
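
As an informal illustration, which is not part of the argument, the ordering on weights can be sketched in Python. The names `weight_precedes` and `weight_key` are hypothetical, and weights are encoded simply as integer pairs `(c, d)`. The sketch assumes only the definition above, under which a larger internal class comes earlier and, within one class, a smaller leading degree comes earlier.

```python
def weight_precedes(w1, w2):
    """True if w1 = (c, d) strictly precedes w2 = (c', d') as in Definition 4.7."""
    (c1, d1), (c2, d2) = w1, w2
    return c1 > c2 or (c1 == c2 and d1 < d2)


def weight_key(w):
    """Sort key realising the same order: larger class first, then smaller degree."""
    c, d = w
    return (-c, d)


# Hypothetical sample weights (internal class, leading degree).
weights = [(1, 3), (2, 5), (1, 1), (3, 7), (2, 2)]
sorted_weights = sorted(weights, key=weight_key)
```

In this encoding a map of internal class 2 precedes every map of internal class 1, whatever the two leading degrees are.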



Definition 4.8 (PET ordering on polynomials) Given two polynomial maps $\varphi, \psi :
\mathbb{R} \times \mathbb{R}^r \to G$, the first precedes the second in the PET ordering, denoted
$\varphi \prec_{\mathrm{PET}} \psi$, if $\mathrm{wt}\, \varphi \prec \mathrm{wt}\, \psi$.

Remark Our $\prec_{\mathrm{PET}}$ is not quite the same as the PET ordering used in much of
the earlier literature for polynomial maps into nilpotent groups. Those required 
a comparison between polynomials in terms of the individual members of some
Mal'cev basis of $G$; see, for instance, Section 3 in [23]. Our ordering is actually
a little weaker (in the sense that ^pet^^pet""*^ relations), because we com- 
pare our polynomials on the whole Euclidean subquotients of G arising from the 
ascending central series, and so in our ordering the assertion that two polynomi- 
als have the same leading term is stronger. However, when we later use the PET 
induction via the van der Corput lemma it will be clear that we are still moving 
strictly downwards among our families of polynomials, so that the induction
proceeds correctly. $\lhd$

The PET ordering on polynomials will play a role in the proof of the special case
$k = 2$ of Theorem 1.3, but the general case will require an extension of it to an
ordering of tuples of polynomials.

Definition 4.9 Suppose that $f, g : \mathrm{Wt} \to \mathbb{N}$ are maps which each take nonzero
values at only finitely many weights. Then $f$ precedes $g$, denoted $f \prec g$, if there is
some $(c,d) \in \mathrm{Wt}$ such that

• $f(c',d') = g(c',d')$ whenever $(c',d') \succ (c,d)$, and

• $f(c,d) < g(c,d)$.

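Definition 4.10 (PET ordering for tuples of polynomials) If $\mathcal{F} = (\varphi_1, \varphi_2, \ldots, \varphi_k)$
is a tuple of polynomial maps then its weight assignment is the function $\mathrm{Wt}\, \mathcal{F} :
\mathrm{Wt} \to \mathbb{N}$ which to each $(c,d) \in \mathrm{Wt}$ assigns the number of $\sim_{\mathrm{LT}}$-equivalence
classes of maps in $\mathcal{F}$ that have weight $(c,d)$.

Suppose now that $\mathcal{F} = (\varphi_1, \varphi_2, \ldots, \varphi_k)$ and $\mathcal{G} = (\psi_1, \psi_2, \ldots, \psi_\ell)$ are families of
polynomial maps $\mathbb{R} \times \mathbb{R}^r \to G$. Then $\mathcal{F}$ precedes $\mathcal{G}$, denoted $\mathcal{F} \prec_{\mathrm{PET}} \mathcal{G}$, if

• either $\mathrm{Wt}\, \mathcal{F} \prec \mathrm{Wt}\, \mathcal{G}$,

• or $\mathrm{Wt}\, \mathcal{F} = \mathrm{Wt}\, \mathcal{G}$, and the sets of $\sim_{\mathrm{LT}}$-equivalence classes $\mathcal{F}/{\sim_{\mathrm{LT}}}$ and
$\mathcal{G}/{\sim_{\mathrm{LT}}}$ can be matched in such a way that (i) their weights match, (ii) every
class of $\mathcal{F}$ has cardinality no larger than its corresponding class in $\mathcal{G}$, and
(iii) in at least one instance it is strictly smaller.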
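
The two definitions above can be sketched in Python as follows; this is only an illustration under an assumed encoding, not part of the paper. Each polynomial map is represented by a hypothetical triple `(c, d, label)`, where `(c, d)` is its weight and `label` names its leading-term equivalence class. All function names are invented for the sketch.

```python
from collections import Counter, defaultdict


def weight_key(w):
    c, d = w
    return (-c, d)  # increasing key = increasing in the order on Wt


def lt_classes(family):
    """Sizes of the leading-term classes, keyed by (weight, leading-term label)."""
    return Counter(((c, d), label) for (c, d, label) in family)


def weight_assignment(family):
    """Wt F: the number of leading-term classes of each weight."""
    return Counter(weight for (weight, _label) in lt_classes(family))


def assignment_precedes(f, g):
    """Definition 4.9, decided at the largest weight where f and g differ."""
    for w in sorted(set(f) | set(g), key=weight_key, reverse=True):
        if f[w] != g[w]:
            return f[w] < g[w]
    return False


def family_precedes(fam_f, fam_g):
    """Definition 4.10 for families given as lists of (c, d, label) triples."""
    wf, wg = weight_assignment(fam_f), weight_assignment(fam_g)
    if wf != wg:
        return assignment_precedes(wf, wg)
    sizes_f, sizes_g = defaultdict(list), defaultdict(list)
    for (w, _label), n in lt_classes(fam_f).items():
        sizes_f[w].append(n)
    for (w, _label), n in lt_classes(fam_g).items():
        sizes_g[w].append(n)
    # With equal weight assignments, a size-dominated matching exists exactly
    # when the sorted class sizes are dominated weight by weight.
    dominated = all(
        a <= b
        for w in sizes_f
        for a, b in zip(sorted(sizes_f[w]), sorted(sizes_g[w]))
    )
    # Given domination, some class is strictly smaller iff the family is shorter.
    return dominated and len(fam_f) < len(fam_g)


# Hypothetical family: two maps in one class of weight (1, 2), one of weight (1, 3).
family = [(1, 2, "a"), (1, 2, "a"), (1, 3, "b")]
```

For instance, `family_precedes(family[1:], family)` holds because one class shrinks while the weight assignment is unchanged, and `family_precedes([(1, 3, "b")], family)` holds because a whole class of weight (1, 2) has been removed.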

As in most proofs that use the PET ordering, it is needed for a particular pair of 
families of maps, one derived from the other according to the following definitions. 

Definition 4.11 (Pivot) If $\mathcal{F} = (\varphi_1, \varphi_2, \ldots, \varphi_k)$ is a tuple of polynomial maps
$\mathbb{R} \times \mathbb{R}^r \to G$ then a pivot for $\mathcal{F}$ is a PET-minimal member $\varphi_i \in \mathcal{F}$.

Definition 4.12 (Derived family) Suppose that $\mathcal{F} = (\varphi_1, \varphi_2, \ldots, \varphi_k)$ is a tuple
of polynomial maps $\mathbb{R} \times \mathbb{R}^r \to G$. Then for $i \le k$ its $i$th derived family consists
of the following polynomial maps $\mathbb{R} \times (\mathbb{R} \times \mathbb{R}^r) \to G$:

$$(t,k,h) \mapsto \varphi_j(t,h)\varphi_i(t,h)^{-1} \quad \text{for } j \in \{1,2,\ldots,k\} \setminus \{i\}$$

and

$$(t,k,h) \mapsto \varphi_j(k,h)^{-1}\varphi_j(t+k,h)\varphi_i(t,h)^{-1} \quad \text{for } j \in \{1,2,\ldots,k\}.$$

Note that the pre-multiplication by $\varphi_j(k,h)^{-1}$ in the last line has the consequence
that if $\varphi_i(0,\cdot) = e$ for every $i$, then the same is true of the derived family.

Lemma 4.13 If $\mathcal{F} = (\varphi_1, \varphi_2, \ldots, \varphi_k)$ with $\varphi_1$ a pivot, then its first derived family
precedes it in the PET ordering. Also, the sub-tuple $(\varphi_2, \ldots, \varphi_k)$ precedes $\mathcal{F}$ in
the PET ordering.

Proof For each $j \ge 2$ consider the polynomial maps

$$\varphi_j(t,h)\varphi_1(t,h)^{-1} \quad \text{and} \quad \varphi_j(k,h)^{-1}\varphi_j(t+k,h)\varphi_1(t,h)^{-1}.$$

Because $\varphi_1$ is a pivot,

• either $\mathrm{wt}\, \varphi_j \succ \mathrm{wt}\, \varphi_1$,

• or $\mathrm{wt}\, \varphi_j = \mathrm{wt}\, \varphi_1$ but $\varphi_j \not\sim_{\mathrm{LT}} \varphi_1$,

• or $\varphi_j \sim_{\mathrm{LT}} \varphi_1$.

In the first case both of the new maps above still have weight equal to $\mathrm{wt}\, \varphi_j$, and
are actually leading-term equivalent by comparing their leading terms in $G^c/G^{c+1}$
for $c = \mathrm{cl}\, \varphi_j$. By the same reasoning, if $\varphi_j \sim_{\mathrm{LT}} \varphi_{j'}$ then all four of the resulting
new maps are leading-term equivalent.

The same conclusions hold when $\mathrm{wt}\, \varphi_1 = \mathrm{wt}\, \varphi_j$ but $\varphi_1 \not\sim_{\mathrm{LT}} \varphi_j$, since in this case
the leading term of either of the above maps into $G^c/G^{c+1}$ is given by the nonzero
difference of the leading terms of $\varphi_1$ and $\varphi_j$.

Lastly, if $\varphi_j \sim_{\mathrm{LT}} \varphi_1$, then these leading terms do cancel, and so both of the
polynomial maps written above now strictly precede $\varphi_1$ in the PET ordering.

Therefore overall the equivalence classes of $\mathcal{F}$ and of its 1st derived family are
in bijective weight-preserving correspondence, apart from the equivalence class of
$\varphi_1$, which is replaced by (possibly several) classes in the derived family of strictly
lower weight. This proves the first assertion.

The second assertion is obvious, because the removal of $\varphi_1$ either removes a whole
$\sim_{\mathrm{LT}}$-equivalence class in case $\varphi_1$ is in a singleton class, and hence reduces $\mathrm{Wt}\, \mathcal{F}$
in $\prec$, or leaves the $\sim_{\mathrm{LT}}$-class structure of $\mathcal{F}$ unchanged but reduces the cardinality
of exactly one of the classes. □

5 Finer results for actions of nilpotent Lie groups 

For any inclusion $H \le G$ of topological groups, $H^{\mathrm{n}}$ will denote the topological
normal closure of $H$ in $G$: that is, the completion of the normal closure in $G$.
This notation suppresses the dependence of this definition on the larger group $G$,
which will always be clear from the context. Similarly, if $G$ is a connected and
simply connected Lie group with Lie algebra $\mathfrak{g}$ and $V \le \mathfrak{g}$ is a Lie subalgebra,
then $V^{\mathrm{n}}$ denotes the Lie algebra generated by $\sum_{g \in G} \mathrm{Ad}(g)V$ (equivalently, the Lie
ideal generated by $V$ in $\mathfrak{g}$), so that $\exp(V^{\mathrm{n}}) = (\exp V)^{\mathrm{n}}$.

The first important result we need is a consequence of the classical Mautner
Phenomenon. We will make use of the following expression of this argument as
isolated by Margulis [27]: it can also be found as Lemma 2.2 in Subsection 2.1 of
Starkov [38].

Lemma 5.1 (Mautner Phenomenon) Suppose that $\pi : G \curvearrowright \mathfrak{H}$ is an orthogonal
representation of a connected Lie group, that $H \le G$ is a connected Lie subgroup,
that $g \in G$ and that there are sequences $g_i \in G$ and $h_i, h_i' \in H$ with $g_i \to e$
and $g_i h_i g_i^{-1} h_i' \to g$. Then

$$\mathrm{Fix}(\pi(g)) \supseteq \mathrm{Fix}(\pi(H)).$$

□

Corollary 5.2 If $G$ is a connected and simply connected nilpotent Lie group, $H \le
G$ is a connected closed subgroup and $\pi : G \curvearrowright \mathfrak{H}$ is an orthogonal representation,
then

$$\mathrm{Fix}(\pi(H)) = \mathrm{Fix}(\pi(H^{\mathrm{n}})).$$

Similarly, if $(X, \Sigma, \mu, u)$ is a $G$-system then

$$\Sigma^{H} = \Sigma^{H^{\mathrm{n}}}.$$

Proof We focus on the first claim, since the second follows at once by considering
the Koopman representation.

A simple calculation shows that $H^{\mathrm{n}} = \langle H[H,G] \rangle$, where $[H,G]$ is the subgroup
generated by all commutators of elements of $H$ with elements of $G$. Let

$$G = G_1 \trianglerighteq G_2 \trianglerighteq \cdots \trianglerighteq G_s \trianglerighteq G_{s+1} = \{e\}$$

be a central series of $G$ in which each quotient $G_r/G_{r+1}$ has dimension one; for
example, one may insert extra terms into the ascending central series, as in the
construction of a strong Mal'cev basis. Let $\mathfrak{g}_r$ be the Lie algebra of $G_r$ and $\mathfrak{h}$ the
Lie algebra of $H$.

We will prove by downwards induction on $r$ that if $1 \le r \le s$ then

$$\mathrm{Fix}(\pi(\langle H[G_{r+1},H] \rangle)) = \mathrm{Fix}(\pi(\langle H[G_r,H] \rangle)).$$

When $r = s$ the left-hand side here is $\mathrm{Fix}(\pi(H))$, while when $r = 1$ the right-hand
side is $\mathrm{Fix}(\pi(H^{\mathrm{n}}))$, so this will complete the proof.

When $r = s$ the result is clear because $G_s$ is central in $G$, so now suppose the
result is known for some $r + 1 \le s$. By replacing $H$ with $\langle H[G_{r+1},H] \rangle$, we may
assume that they are equal, since another easy calculation shows that the sets

$$\langle H[G_{r+1},H] \rangle \cdot [G_{r+1}, \langle H[G_{r+1},H] \rangle] \quad \text{and} \quad H[G_{r+1},H]$$

generate the same subgroup of $G$.

Let $V \in \mathfrak{g}_r \setminus \mathfrak{g}_{r+1}$, so that $\mathfrak{g}_r$ is the smallest Lie algebra containing both $V$ and
$\mathfrak{g}_{r+1}$. The subgroup $\langle H[G_r,H] \rangle$ is connected, and its Lie algebra is the smallest
Lie subalgebra of $\mathfrak{g}$ that contains both $\mathfrak{h}$ and $\{[V,U] : U \in \mathfrak{h}\}$. It therefore suffices
to show that any $v \in \mathrm{Fix}(\pi(H))$ is also fixed by $\exp([V,U])$ for any $U \in \mathfrak{h}$.

This can be deduced using Lemma 5.1. We need to show that if $U \in \mathfrak{h}$ then
$\exp([V,U])$ is a limit of group elements of the form $g_i h_i g_i^{-1} h_i'$, as treated in that
lemma. This follows from the Baker-Campbell-Hausdorff formula, which implies
for any $t > 0$ that

$$\exp(tV) \exp((1/t)U) \exp(-tV) \exp(-(1/t)U) = \exp([V,U] + O(t)) \exp(R(t)),$$

where $R(t)$ collects those multiple commutators that involve at least one copy of
$V$ and at least two entries from $\mathfrak{h}$, which must therefore lie in $\mathfrak{h}$ (recall that we
have reduced to the case $H = \langle H[G_{r+1},H] \rangle$). Hence

$$\exp([V,U] + O(t)) = \exp(tV) \exp((1/t)U) \exp(-tV) \big( \exp(-(1/t)U) \exp(-R(t)) \big),$$

so letting $t = 1/i$ gives the conditions needed by Lemma 5.1.
□
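
The commutator identity used above can be checked by hand in the simplest non-abelian case. The following Python sketch is an illustration only, with all names hypothetical: it works in the three-dimensional Heisenberg group of unipotent upper-triangular 3x3 matrices, taking $V = E_{12}$ and $U = E_{23}$. Because this group is 2-step nilpotent, both the $O(t)$ correction and $R(t)$ vanish there, so the product equals $\exp([V,U])$ exactly for every $t \neq 0$, while $\exp(tV) \to e$ as $t \to 0$.

```python
from fractions import Fraction


def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def unipotent_exp(x):
    """exp(X) = I + X + X^2/2 for a strictly upper-triangular 3x3 matrix X."""
    x2 = matmul(x, x)
    return [
        [Fraction(int(i == j)) + x[i][j] + x2[i][j] / 2 for j in range(3)]
        for i in range(3)
    ]


def scaled(x, c):
    """The matrix c * X."""
    return [[c * entry for entry in row] for row in x]


ZERO, ONE = Fraction(0), Fraction(1)
V = [[ZERO, ONE, ZERO], [ZERO, ZERO, ZERO], [ZERO, ZERO, ZERO]]  # E_12
U = [[ZERO, ZERO, ZERO], [ZERO, ZERO, ONE], [ZERO, ZERO, ZERO]]  # E_23
VU, UV = matmul(V, U), matmul(U, V)
BRACKET = [[VU[i][j] - UV[i][j] for j in range(3)] for i in range(3)]  # [V, U] = E_13


def conjugation_product(t):
    """exp(tV) exp(U/t) exp(-tV) exp(-U/t) for a nonzero rational t."""
    factors = [
        unipotent_exp(scaled(V, t)),
        unipotent_exp(scaled(U, 1 / t)),
        unipotent_exp(scaled(V, -t)),
        unipotent_exp(scaled(U, -1 / t)),
    ]
    result = factors[0]
    for factor in factors[1:]:
        result = matmul(result, factor)
    return result
```

Evaluating `conjugation_product(Fraction(1, i))` for increasing `i` realises the sequence $g_i h_i g_i^{-1} h_i'$ of Lemma 5.1 in this example, with $g_i = \exp(V/i)$ and $h_i = \exp(iU)$.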



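Corollary 5.3 If $G$ is a connected and simply connected nilpotent Lie group, $H_1, H_2 \le
G$ are connected closed subgroups and $\pi : G \curvearrowright \mathfrak{H}$ is an orthogonal representation,
then the subspaces

$$\mathrm{Fix}(\pi(H_1)),\ \mathrm{Fix}(\pi(H_2)) \le \mathfrak{H}$$

are relatively orthogonal over their common further subspace

$$\mathrm{Fix}(\pi(\langle H_1 \cup H_2 \rangle))$$

(meaning that

$$\mathrm{Fix}(\pi(H_1)) \ominus \mathrm{Fix}(\pi(\langle H_1 \cup H_2 \rangle)) \perp \mathrm{Fix}(\pi(H_2)) \ominus \mathrm{Fix}(\pi(\langle H_1 \cup H_2 \rangle)).\ )$$

Similarly, if $(X, \Sigma, \mu, u)$ is a $G$-system then $\Sigma^{H_1}$ and $\Sigma^{H_2}$ are relatively
independent over $\Sigma^{\langle H_1 \cup H_2 \rangle}$.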

Proof For a Lie subgroup $H \le G$, since

$$\mathrm{Fix}(\pi(H)) = \mathrm{Fix}(\pi(H^{\mathrm{n}}))$$

and $H^{\mathrm{n}} \trianglelefteq G$, this subspace of $\mathfrak{H}$ is actually invariant under the whole action $\pi$.
Therefore the orthogonal projections $P_i$ onto $\mathrm{Fix}(\pi(H_i))$ both commute with $\pi$.

It follows that $P_1 P_2$ has image contained in $\mathrm{Fix}(\pi(\langle H_1 \cup H_2 \rangle))$. Since conversely
any vector fixed by both $H_1$ and $H_2$ is also fixed by $P_1$ and $P_2$, it follows that $P_1 P_2$
is an idempotent with image equal to $\mathrm{Fix}(\pi(\langle H_1 \cup H_2 \rangle))$, and the same holds for
$P_2 P_1$. Hence for any vectors $u \in \mathfrak{H}$ and $v \in \mathrm{Fix}(\pi(\langle H_1 \cup H_2 \rangle))$ one has

$$\langle u, v \rangle = \langle u, (P_1 P_2)v \rangle = \langle (P_2 P_1)u, v \rangle,$$

so in fact $P_2 P_1$ is the orthogonal projection onto its image, and similarly for $P_1 P_2$.
Finally, if $v_i \in \mathrm{Fix}(\pi(H_i))$ for $i = 1,2$ then this implies

$$\langle v_1, v_2 \rangle = \langle P_1 v_1, P_2 v_2 \rangle = \langle P_2 P_1 v_1, v_2 \rangle = \langle (P_2 P_1)v_1, (P_2 P_1)v_2 \rangle,$$

which is the desired relative orthogonality.






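In the case of a $G$-system, applying the above result to the Koopman representation
tells us that for any $\Sigma^{H_i}$-measurable functions $f_i \in L^2(\mu)$ for $i = 1,2$ we have

$$\int_X f_1 f_2 \,\mathrm{d}\mu = \int_X \mathsf{E}_\mu(f_1 \,|\, \Sigma^{\langle H_1 \cup H_2 \rangle})\, \mathsf{E}_\mu(f_2 \,|\, \Sigma^{\langle H_1 \cup H_2 \rangle}) \,\mathrm{d}\mu,$$

and this is the desired relative independence. □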

Example The above proofs are intimately tied to the nilpotency of $G$, so it is
worth including an example of a solvable Lie group $G$ and representation $\pi : G \curvearrowright
\mathfrak{H}$ to show that this restriction is really needed.

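Let $\rho : \mathbb{R} \curvearrowright \mathbb{C}$ be the rotation action defined by

$$\rho^t(z) := \mathrm{e}^{2\pi \mathrm{i} t} z,$$

and let $G := \mathbb{C} \rtimes_\rho \mathbb{R}$. This is a simple three-dimensional solvable Lie group; in
coordinates it is $\mathbb{C} \times \mathbb{R}$ with the product

$$(u,s) \cdot (v,t) := (\rho^t u + v, s + t).$$

It may also be interpreted as a group extension of $\mathbb{Z}$ by the group $\mathbb{C} \rtimes \mathrm{S}^1$ of
orientation-preserving isometries of $\mathbb{C}$, and this picture gives an action $\xi : G \curvearrowright \mathbb{C}$
with kernel isomorphic to $\mathbb{Z}$.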

For each $v \in \mathbb{C}$ let $G_v$ be the isotropy subgroup $\{g \in G : \xi^g v = v\}$. Then
$G_v \cong \mathbb{R}$, and $G_v$ and $G_w$ are conjugated by the 'translational' element $(w - v, 0) \in
G$. Moreover, since any translation of $\mathbb{C}$ may be obtained as a composite of two
rotations about different points, the groups $G_v$ together generate $G$, and so $G_v^{\mathrm{n}} = G$
for every $v$. A simple calculation shows that in coordinates one has

$$G_v = \{(v - \rho^t(v), t) : t \in \mathbb{R}\}.$$
Now consider the action tt : G rv L^(msi) = L^(mgi) (8)]r C defined by 

where $\langle \rho^{-t}u, z \rangle$ is the usual inner product of $\mathbb{C}$ regarded as a vector space over $\mathbb{R}$.
(A routine check shows that this formula correctly defines an action of $G$.) The
subspace $\mathrm{Fix}(\pi(G_v))$ consists of those functions $f$ such that

^2^i(p-'u,z)j^pt^^ = f{z) G S\ i € M : 

that is, of the constant complex multiples of the function z ^ Q-'i'^^{u,z) _ These are 
all distinct 2-real-dimensional subspaces of $L^2(m_{\mathrm{S}^1})$, so are not equal to $\mathrm{Fix}(\pi(G)) =
\{0\}$, and also (by considering close-by values of $v$, for instance) they are not
pairwise orthogonal. $\lhd$

Another useful result in a similar vein to Corollary 5.2 is the following simple
relative of the Pugh-Shub Theorem [32]. An adaptation of their theorem to the
setting of nilpotent groups has previously been given by Ratner in Proposition 5.1
of [35]. Although our formulation is superficially different from hers, each version
can easily be deduced from the proof of the other.



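Lemma 5.4 Let $\pi : G \curvearrowright \mathfrak{H}$ be an orthogonal representation of a connected nilpotent
Lie group, and let $\mathrm{Lat}\, \mathfrak{g}$ be the family of all proper Lie subalgebras of $\mathfrak{g}$. Then
the subfamily

$$\mathcal{A} := \{V \in \mathrm{Lat}\, \mathfrak{g} : \mathrm{Fix}(\pi(\exp V)) \supsetneq \mathrm{Fix}(\pi(G))\}$$

has countably many maximal elements.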



Proof Suppose that $V_1, V_2 \in \mathcal{A}$ are two distinct maximal elements. Then the Lie
subalgebra generated by $V_1 + V_2$ must strictly contain them both, and hence

$$\mathrm{Fix}(\pi(\langle \exp V_1 \cup \exp V_2 \rangle)) = \mathrm{Fix}(\pi(G)),$$

by their maximality.

Corollary 5.3 now implies that $\mathrm{Fix}(\pi(\exp V_1))$ and $\mathrm{Fix}(\pi(\exp V_2))$ are relatively
orthogonal over $\mathrm{Fix}(\pi(G))$. Therefore there can be at most countably many of
these maximal elements of $\mathcal{A}$, because $\mathfrak{H}$ is separable: indeed, if $\mathcal{A}_1 \subseteq \mathcal{A}$ were an
uncountable collection of maximal elements, then choosing some representative
unit vectors

$$x_V \in \mathrm{Fix}(\pi(\exp V)) \ominus \mathrm{Fix}(\pi(G)), \quad V \in \mathcal{A}_1,$$

would give an uncountable sequence of orthonormal vectors in $\mathfrak{H}$, and hence a
contradiction. □

Example It is certainly not true that $\mathcal{A}_1$ is generally finite. For example, consider
the obvious rotation action of $\mathbb{R}^2$ on $\mathbb{T}^2$ and let $\pi : \mathbb{R}^2 \curvearrowright L^2(m_{\mathbb{T}^2})$ be the resulting
orthogonal representation. Then any one-dimensional subgroup $\mathbb{R}v \le \mathbb{R}^2$ of
rational slope has some non-trivial invariant functions, but the whole $\mathbb{R}^2$-action is
ergodic. $\lhd$

This conclusion of countability (rather than finitude) gives rise to the need for the
notion of Zariski genericity (rather than simply Zariski openness). The connection
between them is established by the following.

Corollary 5.5 If $\varphi : \mathbb{R} \times \mathbb{R}^r \to G$ is a polynomial map into a connected and simply
connected nilpotent Lie group and $\pi : G \curvearrowright \mathfrak{H}$ is an orthogonal representation,
then the map

$$\mathbb{R}^r \to \{\text{subspaces of } \mathfrak{H}\} : h \mapsto \mathrm{Fix}(\pi(\langle \mathrm{img}\, \varphi(\cdot,h) \rangle))$$

takes the fixed value $\mathrm{Fix}(\pi(\langle \mathrm{img}\, \varphi \rangle))$ Zariski generically. Similarly, if $(X, \Sigma, \mu, u)$
is a $G$-system then the $\sigma$-subalgebra $\Sigma^{\langle \mathrm{img}\, \varphi(\cdot,h) \rangle}$ agrees with $\Sigma^{\langle \mathrm{img}\, \varphi \rangle}$ up to
$\mu$-negligible sets for Zariski generic $h$.

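Proof Replacing $G$ with $\langle \mathrm{img}\, \varphi \rangle^{\mathrm{n}}$ if necessary, we may assume they are equal.

Let $\mathcal{A} \subseteq \mathrm{Lat}\, \mathfrak{g}$ be the family of all Lie subalgebras with fixed-point subspaces
strictly larger than $\mathrm{Fix}(\pi(G))$, as in Lemma 5.4, and let $\mathcal{A}_1 \subseteq \mathcal{A}$ be the subfamily
of maximal elements of $\mathcal{A}$, so Lemma 5.4 shows that this is countable. Since
$\mathrm{Fix}(\pi(\exp V^{\mathrm{n}})) = \mathrm{Fix}(\pi(\exp V))$ for any $V \in \mathrm{Lat}\, \mathfrak{g}$ by Corollary 5.2, by
maximality we must have $V = V^{\mathrm{n}}$ for every $V \in \mathcal{A}_1$.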

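Now,

$$\{h : \mathrm{Fix}(\pi(\langle \mathrm{img}\, \varphi(\cdot,h) \rangle)) \supsetneq \mathrm{Fix}(\pi(\langle \mathrm{img}\, \varphi \rangle)) = \mathrm{Fix}(\pi(G))\} = \bigcup_{V \in \mathcal{A}_1} \{h : \varphi(t,h) \in \exp V \ \forall t \in \mathbb{R}\},$$

and so by the countability of $\mathcal{A}_1$ it suffices to show that each individual set $\{h :
\varphi(t,h) \in \exp V \ \forall t \in \mathbb{R}\}$ is proper and Zariski closed in $\mathbb{R}^r$. Since $\mathrm{Fix}(\pi(\langle \mathrm{img}\, \varphi \rangle)) =
\mathrm{Fix}(\pi(G))$, the subgroup $\langle \mathrm{img}\, \varphi \rangle$ is not contained in $\exp V$ for any $V \in \mathcal{A}_1$, and
so in fact $\mathrm{img}\, \varphi \not\subseteq \exp V$ (since $\exp V$ is itself a subgroup).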

Therefore for any $V \in \mathcal{A}_1$ we may choose a linear form $\ell \in \mathfrak{g}^*$ which annihilates
$V$ but does not annihilate the whole of $\exp^{-1}(\mathrm{img}\, \varphi)$, and now one has

$$\{h : \varphi(t,h) \in \exp V \ \forall t \in \mathbb{R}\} \subseteq \{h : \ell(\exp^{-1}(\varphi(t,h))) = 0 \ \forall t \in \mathbb{R}\}.$$

However, the map $(t,h) \mapsto \ell(\exp^{-1}(\varphi(t,h)))$ is a polynomial $\mathbb{R} \times \mathbb{R}^r \to \mathbb{R}$, by
Corollary 4.3. By collecting monomials it may be expressed as

$$t^d p_d(h) + t^{d-1} p_{d-1}(h) + \cdots + t\, p_1(h) + p_0(h)$$

for some $p_i \in \mathbb{R}[h_1, \ldots, h_r]$, and now

$$\{h : \ell(\exp^{-1}(\varphi(t,h))) = 0 \ \forall t \in \mathbb{R}\} = \bigcap_{i=0}^{d} \{h : p_i(h) = 0\}.$$

This is manifestly a real algebraic subvariety of $\mathbb{R}^r$, and it is proper because the
map $\ell \circ \exp^{-1} \circ\, \varphi$ was chosen so as not to vanish identically, so it is a Zariski
meagre subset of $\mathbb{R}^r$, as required.

Once again the conclusion about $G$-systems follows at once by considering
Koopman representations. □



6 Idempotent classes 



The final ingredients needed for the proof of Theorem 1.3 are some results on
'idempotent classes' of probability-preserving systems. These were introduced 
in [3] ffll building on the earlier notion of a 'pleasant extensions' of systems m 
(and also worth comparing with Host's 'magic extensions' from [19]).



Definition 6.1 (Idempotent and hereditary classes) For any l.c.s.c. group G, a 
class C of jointly-measurable, probability-preserving G-systems is idempotent if 
it is closed under measure-theoretic isomorphisms, inverse limits and arbitrary 
joinings. It is hereditary if it is closed under passing to factors. 



Example The leading examples of idempotent classes are those of the form

$$\mathsf{C}_0^{H_1} \vee \cdots \vee \mathsf{C}_0^{H_\ell}$$

for some closed normal subgroups $H_1, H_2, \ldots, H_\ell \trianglelefteq G$, where this denotes the
class of all $G$-systems which can be expressed as a joining of systems $\mathbf{Y}_1, \mathbf{Y}_2, \ldots,
\mathbf{Y}_\ell$ where each $\mathbf{Y}_i$ has trivial $H_i$-subaction. $\lhd$

The reference contains an introduction to idempotent classes in the case of a 
discrete acting group. In earlier works, idempotent classes were introduced to set 
up the theory of 'sated extensions' of probability-preserving systems, which then 
play the primary role in applications of these ideas. However, sated extensions 
are a little inconvenient in the present setting, and so we will work instead with 
some more elementary results about idempotent classes. The reasoning behind this 
change of perspective relates to the need to change the group that acts on a system, 
which will appear in Section 8.

In addition, our interest here is in actions of Lie groups, for which these ideas have 
not previously appeared in the literature. Therefore the basic definitions and results 
we need have been included below for completeness. Only very simple changes 
and additions are needed to the treatments in UJ or We will also introduce a 
slightly novel example of an idempotent class, useful for handling the polynomial
maps of the present setting. 

Lemma 6.2 (C.f. Lemma 2.2.2 in fl\) If C is an idempotent class of G-systems 
and $\mathbf{X} = (X, \Sigma, \mu, u)$ is any $G$-system, then $\mathbf{X}$ has an essentially unique largest
factor $\Lambda \le \Sigma$ that may be generated by a factor map to a member of $\mathsf{C}$.

Proof It is clear that under the above assumption the family of factors

$$\{\Xi \le \Sigma : \Xi \text{ is generated by a factor map to a system in } \mathsf{C}\}$$

is nonempty (it contains $\{\emptyset, X\}$, which corresponds to the trivial system), upwards
directed (because $\mathsf{C}$ is closed under joinings) and closed under taking $\sigma$-algebra
completions of increasing unions (because $\mathsf{C}$ is closed under inverse limits). There
is therefore a maximal $\sigma$-subalgebra in this family. □

Definition 6.3 (Maximal C-factors) The factor $\Lambda$ obtained in the preceding lemma
is the maximal $\mathsf{C}$-factor of $(X, \Sigma, \mu, u)$, and will sometimes be denoted by the
(slightly abusive) notation $\mathsf{C}\Sigma$. Similarly, we will sometimes denote by $\mathsf{C}\mathbf{X}$ a choice
of a member of $\mathsf{C}$ such that $\mathsf{C}\Sigma$ can be generated by a factor map $\mathbf{X} \to \mathsf{C}\mathbf{X}$.

The importance of idempotent classes derives from the following proposition. 

Proposition 6.4 (Joinings to members of idempotent classes) Suppose that $\mathsf{C}$ is
a hereditary idempotent class of $G$-systems, that $\mathbf{X} = (X, \Sigma, \mu, u)$ is any $G$-system
and $\mathbf{Y} = (Y, \Phi, \nu, v)$ is a member of $\mathsf{C}$. Then for any joining

$$\mathbf{Z} = (X \times Y, \Sigma \otimes \Phi, \lambda, u \times v),$$

where $\pi : \mathbf{Z} \to \mathbf{X}$ and $\xi : \mathbf{Z} \to \mathbf{Y}$ are the coordinate projections, there is some further
factor $\Lambda$ of $\mathbf{X}$ which is generated by a factor map to a member of $\mathsf{C}$ and such that the
factor $\pi^{-1}(\Sigma)$ is relatively independent from $\xi^{-1}(\Phi)$ over $\pi^{-1}(\Lambda)$. Concretely, this
means that

$$\int_Z f(x)g(y)\, \lambda(\mathrm{d}x, \mathrm{d}y) = \int_Z \mathsf{E}_\mu(f \,|\, \Lambda)(x)\, g(y)\, \lambda(\mathrm{d}x, \mathrm{d}y)$$

for any $f \in L^\infty(\mu)$ and $g \in L^\infty(\nu)$ (so we do not require that $\pi^{-1}(\Lambda)$ also be
contained in $\xi^{-1}(\Phi)$ up to negligible sets).

Proof We will construct from the joining $\lambda$ a new joining of $\mathbf{X}$ with a $\mathsf{C}$-system
such that $\lambda$ is relatively independent over a factor of $\mathbf{X}$ which in that new joining
is actually determined by the coordinate in the $\mathsf{C}$-system.

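Let $\lambda : X \to \Pr Y$ be a disintegration of $\lambda$ over the coordinate projection to $X$.
Form the infinite Cartesian product

$$Z' := X \times Y^{\mathbb{N}}$$

and let $\lambda'$ be the $(u \times v^{\times \mathbb{N}})$-invariant measure obtained as the relatively independent
product of copies of $\lambda$:

$$\lambda' = \int_X \delta_x \otimes \lambda(x, \cdot)^{\otimes \mathbb{N}}\, \mu(\mathrm{d}x).$$

Let $\pi' : Z' \to X$ be the first coordinate projection, and let $\lambda_1$ be the image of $\lambda'$
under the projection to $Y^{\mathbb{N}}$.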

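Finally, let $\Lambda \le \Sigma$ be the $\sigma$-algebra of those sets which are $\lambda'$-a.s. determined by
the remaining coordinates of $Z'$:

$$\Lambda := \{A \in \Sigma : \exists B \in \Phi^{\otimes \mathbb{N}} \text{ s.t. } \lambda'((A \times Y^{\mathbb{N}}) \triangle (X \times B)) = 0\}.$$

This is clearly a factor of $\mathbf{X}$, and by definition it also specifies a factor of the
system $(Y^{\mathbb{N}}, \Phi^{\otimes \mathbb{N}}, \lambda_1, v^{\times \mathbb{N}})$ (since each $A \in \Lambda$ is identified with a member of
$\Phi^{\otimes \mathbb{N}}$ uniquely up to negligible sets). Let $\Lambda' := (\pi')^{-1}(\Lambda)$, so up to negligible sets
this is measurable with respect to either $\pi'$ or the coordinate projection $Z' \to Y^{\mathbb{N}}$.
The system $(Y^{\mathbb{N}}, \Phi^{\otimes \mathbb{N}}, \lambda_1, v^{\times \mathbb{N}})$ is a member of $\mathsf{C}$, because $\mathbf{Y} \in \mathsf{C}$ and $\mathsf{C}$ is closed
under joinings; and hence the factor of $\mathbf{X}$ generated by $\Lambda$ is also in $\mathsf{C}$, because it
may be identified with a factor of that member of $\mathsf{C}$ and $\mathsf{C}$ is hereditary.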

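Now let $f \in L^\infty(\mu)$ and $g \in L^\infty(\nu)$. To prove the desired equality of integrals, it
suffices to show that

$$\mathsf{E}_\mu(f \,|\, \Lambda) = 0 \quad \Longrightarrow \quad \mathsf{E}_\lambda(f \circ \pi \,|\, \{\emptyset, X\} \otimes \Phi) = 0,$$

since an arbitrary $f$ may be decomposed as $\mathsf{E}_\mu(f \,|\, \Lambda) + (f - \mathsf{E}_\mu(f \,|\, \Lambda))$, and this
decomposition inserted into the two integrals against $g$ then shows that they are
equal.

Thus, suppose conversely that

$$g := \mathsf{E}_\lambda(f \circ \pi \,|\, \{\emptyset, X\} \otimes \Phi) \neq 0,$$

and hence

$$\int_Z (f \circ \pi) \cdot g \,\mathrm{d}\lambda = \|g\|_2^2 \neq 0.$$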



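For each $i \in \mathbb{N}$ let $\alpha_i : Z' \to Y$ be the coordinate projection to the $i$th copy
of $Y$, and let $g_i := g \circ \alpha_i$. By the construction of $\lambda'$, the pair of coordinates
$(\pi', \alpha_i) : Z' \to Z$ has the distribution $\lambda$ for any $i$. This has the following two
consequences:

• for any $M \ge 1$ one has

$$\int_{Z'} (f \circ \pi') \Big( \frac{1}{M} \sum_{m=1}^{M} g_m \Big) \,\mathrm{d}\lambda' = \int_Z (f \circ \pi) \cdot g \,\mathrm{d}\lambda = \|g\|_2^2 \neq 0;$$

• for all $i$ one has

$$\mathsf{E}_{\lambda'}(g_i \,|\, \Sigma \otimes \{\emptyset, Y^{\mathbb{N}}\}) = \mathsf{E}_{\lambda'}(g_1 \,|\, \Sigma \otimes \{\emptyset, Y^{\mathbb{N}}\}),$$

so we may let $h$ be this common conditional expectation.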

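Next, since all the $Y$-valued coordinates in $Z'$ are relatively independent under $\lambda'$
given the $X$-coordinate, one has

$$\int_{Z'} (g_i - h)(g_j - h) \,\mathrm{d}\lambda' = 0 \quad \text{whenever } i \neq j,$$

and as $M \to \infty$ this implies the simple estimate

$$\Big\| \frac{1}{M} \sum_{m=1}^{M} g_m - h \Big\|_2^2 = \Big\| \frac{1}{M} \sum_{m=1}^{M} (g_m - h) \Big\|_2^2 = \frac{1}{M^2} \sum_{m=1}^{M} \|g_m - h\|_2^2 = O\Big( \frac{1}{M} \Big).$$

Hence

$$\frac{1}{M} \sum_{m=1}^{M} g_m \to h$$

in $\| \cdot \|_2$ as $M \to \infty$. On the one hand, this implies that $h$ is a limit of functions
measurable with respect to $\{\emptyset, X\} \otimes \Phi^{\otimes \mathbb{N}}$, hence is itself virtually measurable
with respect to that $\sigma$-algebra. Therefore as a function on $X$ it must actually be
$\Lambda$-measurable. On the other hand, the above non-vanishing integral now gives

$$\int_{Z'} (f \circ \pi') \cdot h \,\mathrm{d}\lambda' \neq 0.$$

Therefore $\mathsf{E}_\mu(f \,|\, \Lambda) \neq 0$, so since $\Lambda$ defines a $\mathsf{C}$-factor of $\mathbf{X}$ this completes the
proof. □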
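
The $O(1/M)$ estimate at the heart of this proof can be seen exactly in a toy model. The Python sketch below is an illustration under invented data, not part of the paper: $X = Y = \{0,1\}$, the law `MU` of the $X$-coordinate and the disintegration `KERNEL` are hypothetical choices, and $g(y) = y$. It enumerates the relatively independent product of $M$ copies and computes the squared $L^2$ distance between the average of the $g_m$ and their common conditional expectation $h$.

```python
from fractions import Fraction
from itertools import product

# Hypothetical toy data: law of the X-coordinate and the kernel x -> lambda(x, .).
MU = {0: Fraction(1, 2), 1: Fraction(1, 2)}
KERNEL = {
    0: {0: Fraction(3, 4), 1: Fraction(1, 4)},
    1: {0: Fraction(1, 4), 1: Fraction(3, 4)},
}


def g(y):
    """A bounded function on Y."""
    return Fraction(y)


def h(x):
    """Conditional expectation of g(y_m) given the X-coordinate."""
    return sum(p * g(y) for y, p in KERNEL[x].items())


def squared_l2_error(m_copies):
    """Exact || (1/M) sum_m g_m - h ||_2^2 under the relatively independent product."""
    total = Fraction(0)
    for x, px in MU.items():
        for ys in product(KERNEL[x], repeat=m_copies):
            prob = px
            for y in ys:
                prob *= KERNEL[x][y]  # copies are independent given x
            mean = sum(g(y) for y in ys) / m_copies
            total += prob * (mean - h(x)) ** 2
    return total
```

In this toy model the conditional variance of $g$ is $3/16$ for both values of $x$, so the error is exactly $3/(16M)$, in line with the orthogonality of the differences $g_m - h$.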

Remark This proof can be presented in several superficially different ways. On 
the one hand, it can be deduced almost immediately from a well-chosen appeal to 
the de Finetti-Hewitt-Savage Theorem, as in the paper [26] of Lesigne, Rittaud and 
de la Rue (see also Section 8.5 in Glasner [18] ). On the other, it is a close cousin 
of the proof that for any idempotent class C, any system X has an extension that is 
'C-sated' (Theorem 2.3.2 in HJ). <] 

In previous applications, the idempotent classes of importance were those of the
form $\mathsf{C}_0^{H_1} \vee \cdots \vee \mathsf{C}_0^{H_\ell}$, introduced as examples above. Here we will need some
slightly more complicated examples, because in order to account for the possible
relations among the polynomials of a tuple $\mathcal{F}$ we will need to consider simultaneously
actions of $G$ and also some 'more free' covering group $q : \tilde{G} \to G$.

Lemma 6.5 Suppose that $q : H \to G$ is a continuous homomorphism of l.c.s.c.
groups and that $\mathsf{C}$ is an idempotent class of $H$-systems. Then

$$q_*\mathsf{C} := \{G\text{-systems } \mathbf{X} \text{ such that } \mathbf{X}^{q(\cdot)} \in \mathsf{C}\}$$

is an idempotent class of $G$-systems, and it is hereditary if $\mathsf{C}$ is hereditary.

Proof We must verify that $q_*\mathsf{C}$ is closed under joinings and inverse limits. Both
are immediate: if $\mathbf{Y}$ is a joining of $\mathbf{X}_i \in q_*\mathsf{C}$ for $i = 1,2$ then $\mathbf{Y}^{q(\cdot)}$ is the
corresponding joining of $\mathbf{X}_i^{q(\cdot)}$, so lies in $\mathsf{C}$ because $\mathsf{C}$ is closed under joinings,
and similarly for inverse limits. The last assertion also follows at once from the
definition. □

Definition 6.6 The new class $q_*\mathsf{C}$ constructed in the previous lemma is the image
of $\mathsf{C}$ under $q$.

Lemma 6.7 If $\mathsf{C}$ is an idempotent class of $G$-systems then

$$\overline{\mathsf{C}} := \{\mathbf{X} : \mathbf{X} \text{ is a factor of a member of } \mathsf{C}\}$$

is a hereditary idempotent class.

Proof The hereditary property is built into the definition, so once again it remains
to check closure under joinings and inverse limits. Both are routine, so we give the
proof only for joinings. Suppose that $\mathbf{Y}_i = (Y_i, \Phi_i, \nu_i, v_i) \in \overline{\mathsf{C}}$ for $i = 1,2$, that
$\pi_i : \mathbf{X}_i \to \mathbf{Y}_i$ are factors with $\mathbf{X}_i = (X_i, \Sigma_i, \mu_i, u_i) \in \mathsf{C}$ for $i = 1,2$, and
that $\mathbf{Z} = (Y_1 \times Y_2, \Phi_1 \otimes \Phi_2, \lambda, v_1 \times v_2)$ defines a joining of $\mathbf{Y}_1$ and $\mathbf{Y}_2$. Then
we may define a joining of $\mathbf{X}_1$ and $\mathbf{X}_2$ as a relatively independent product: letting
$P_i : Y_i \to \Pr(X_i)$ be a probability kernel representing the disintegration of $\mu_i$
over $\pi_i$, define

$$\lambda' := \int_{Y_1 \times Y_2} P_1(y_1, \cdot) \otimes P_2(y_2, \cdot)\, \lambda(\mathrm{d}y_1, \mathrm{d}y_2).$$

Now $(X_1 \times X_2, \Sigma_1 \otimes \Sigma_2, \lambda', u_1 \times u_2)$ is a joining of $\mathbf{X}_1$ and $\mathbf{X}_2$, and hence a
member of $\mathsf{C}$. The map $(x_1, x_2) \mapsto (\pi_1(x_1), \pi_2(x_2))$ witnesses $\mathbf{Z}$ as a factor of
this member of $\mathsf{C}$, so $\mathbf{Z} \in \overline{\mathsf{C}}$. □

Definition 6.8 The class $\overline{\mathsf{C}}$ constructed above is the downward closure of $\mathsf{C}$.

When we come to apply this machinery, satedness relative only to classes of the
form $\mathsf{C}_0^{H_1} \vee \cdots \vee \mathsf{C}_0^{H_\ell}$ will not give us quite enough purchase over our situation.
Instead we will need to first form an extended group $q : \tilde{G} \to G$ (in which copies
of certain subgroups of G have been made 'more independent': see Section [D, 
and then for some subgroups $H_1, H_2, \ldots, H_\ell \le \tilde{G}$ we will need to use satedness
relative to the class

$$q_*\big( \overline{\mathsf{C}_0^{H_1} \vee \cdots \vee \mathsf{C}_0^{H_\ell}} \big).$$

In prose, this is

'The class of $G$-systems which, upon re-writing them as $\tilde{G}$-systems,
become factors of joinings of systems in which one of the $H_i$ acts
trivially.'

This manoeuvre will appear during the proof of Proposition 8.2 below, where the
need for it will become clearer. The particular way in which we will appeal to
satedness with respect to such a class is captured by the following lemma.

Lemma 6.9 Suppose that $q : H \to G$ is a continuous epimorphism of Lie groups,
that $\mathsf{C}$ is an idempotent class of $H$-systems and that $\mathbf{X} = (X, \Sigma, \mu, u)$ is a
$G$-system. In addition, suppose that $f \in L^\infty(\mu)$ and that

$$\pi : \mathbf{Y} = (Y, \Phi, \nu, v) \to \mathbf{X}^{q(\cdot)}$$

is an extension of $H$-systems such that

$$\mathsf{E}_\nu(f \circ \pi \,|\, \mathsf{C}\Phi) \neq 0.$$

Then also

$$\mathsf{E}_\mu(f \,|\, (q_*\mathsf{C})\Sigma) \neq 0.$$

Proof We have $\mathsf{E}_\nu(f \circ \pi \,|\, \mathsf{C}\Phi) \neq 0$ by assumption, but on the other hand the
function $f \circ \pi$ is invariant under $v^h$ for every $h \in \ker q$:

$$f \circ \pi \circ v^h = f \circ u^{q(h)} \circ \pi = f \circ u^e \circ \pi = f \circ \pi.$$

Since $\mathsf{C}\Phi$ is a factor of the whole $H$-action $v$, the conditional expectation operator
$\mathsf{E}_\nu(\,\cdot \,|\, \mathsf{C}\Phi)$ preserves this $\ker q$-invariance. Therefore $\mathsf{E}_\nu(f \circ \pi \,|\, \mathsf{C}\Phi)$ is measurable
not only with respect to $\mathsf{C}\Phi$ but also with respect to $\Phi^{\ker q}$.

Let $\alpha : \mathbf{Y} \to \mathbf{Z}$ be a factor map onto another system which generates the factor
$\Phi^{\ker q} \cap \mathsf{C}\Phi$. The target system $\mathbf{Z}$ is an element of $\mathsf{C}$ and has $\ker q$ acting trivially.
Therefore this action of $H$ may be identified with an action of $G$ composed
through $q$, say $\mathbf{Z} = \mathbf{W}^{q(\cdot)}$ for some $G$-system $\mathbf{W}$. (The joint measurability of $v$
implies that of the action of $G$ on $\mathbf{W}$, simply by choosing an everywhere-defined
Borel selector $G \to H$, as we clearly may for Lie group epimorphisms because
they are locally diffeomorphic to orthogonal projections.)

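Now the diagram

$$\mathbf{X}^{q(\cdot)} \xleftarrow{\ \pi\ } \mathbf{Y} \xrightarrow{\ \alpha\ } \mathbf{W}^{q(\cdot)}$$

defines a joining of $\mathbf{X}^{q(\cdot)}$ and $\mathbf{W}^{q(\cdot)}$. It therefore also defines a joining of $\mathbf{X}$ and
$\mathbf{W}$, by simply identifying it with an invariant measure on $X \times W$ and writing the
actions in terms of $G$ rather than $H$.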

Our assumption on $f$ gives that $\mathsf{E}_\nu(f \circ \pi \,|\, \alpha) \neq 0$. Therefore, within this joining of
$\mathbf{X}$ and $\mathbf{W}$, the lift of $f$ has non-trivial conditional expectation onto the copy of $\mathbf{W}$,
which is a member of $q_*\mathsf{C}$, and so by Proposition 6.4 and Lemma 6.7 this implies

$$\mathsf{E}_\mu(f \,|\, (q_*\mathsf{C})\Sigma) \neq 0. \quad □$$

7 The case of two-fold joinings



The case of Theorem 1.1 in which $k = 1$ will form the base of an inductive proof
of the full theorem, and must be handled separately. Its proof is quite routine
in the shadow of other works in this area, but it does already contain an appeal
to the van der Corput estimate and an induction on the PET ordering for single
polynomials (rather than whole tuples). It thus serves as a helpful preparation for
the full induction that is to come.

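Proposition 7.1 Suppose that $\pi : G \curvearrowright \mathfrak{H}$ is an orthogonal representation and
$\varphi : \mathbb{R} \times \mathbb{R}^r \to G$ is a polynomial map such that $\varphi(0, \cdot) = e$. Then the operator
averages

$$\frac{1}{T} \int_0^T \pi(\varphi(t,h)) \,\mathrm{d}t$$

converge in the strong operator topology for every $h$, and the limit operator $P_h$ is
Zariski generically equal to the orthoprojection onto $\mathrm{Fix}(\pi(\langle \mathrm{img}\, \varphi \rangle))$.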

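Proof Step 1 First suppose that $\varphi$ is linear in the first coordinate, meaning that
$\varphi(\cdot, h)$ is a homomorphism for every $h \in \mathbb{R}^r$. Then for every $h$ the map $t \mapsto
\varphi(t,h)$ takes values in a 1-parameter subgroup of $G$, and so the classical ergodic
theorem for orthogonal flows gives

$$\frac{1}{T} \int_0^T \pi(\varphi(t,h)) \,\mathrm{d}t \to P_h \quad \text{as } T \to \infty,$$

where $P_h$ is the orthoprojection onto $\mathrm{Fix}(\pi(\langle \mathrm{img}\, \varphi(\cdot, h) \rangle))$. By Corollary 5.5 this
equals $\mathrm{Fix}(\pi(\langle \mathrm{img}\, \varphi \rangle))$ Zariski generically, and so the proof is complete in the
linear case.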
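
The simplest instance of the ergodic theorem for orthogonal flows invoked in Step 1 can be computed directly. The Python sketch below is only an illustration with hypothetical function names: it takes the rotation flow of $\mathbb{R}^2 \cong \mathbb{C}$ at angular speed `theta`, for which the averaged operator applied to the vector $1$ has a closed form. The averages tend to $0$ when `theta` is nonzero (no nonzero fixed vectors) and equal $1$ when `theta` is zero (every vector is fixed).

```python
import cmath


def rotation_average(theta, T):
    """(1/T) * integral_0^T exp(i*theta*t) dt, in closed form."""
    if theta == 0:
        return 1 + 0j
    return (cmath.exp(1j * theta * T) - 1) / (1j * theta * T)


def midpoint_average(theta, T, steps=20000):
    """Midpoint-rule approximation of the same average, as a sanity check."""
    dt = T / steps
    return sum(cmath.exp(1j * theta * (k + 0.5) * dt) for k in range(steps)) * dt / T
```

Since the numerator of the closed form has modulus at most 2, the average is bounded by $2/(|\theta| T)$, which makes the rate of convergence explicit in this example.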

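Step 2 For arbitrary polynomial maps $\varphi$ we show by PET induction that if

$$\frac{1}{T} \int_0^T \pi(\varphi(t,h))v \,\mathrm{d}t \not\to 0$$

for some $v \in \mathfrak{H}$, then $P_h v \neq 0$, where again $P_h$ is the orthoprojection onto
$\mathrm{Fix}(\pi(\langle \mathrm{img}\, \varphi(\cdot, h) \rangle))$. By decomposing an arbitrary $v$ as $(1 - P_h)v + P_h v$ and
appealing to Corollary 5.5 again, this will complete the proof.

If

$$\frac{1}{T} \int_0^T \pi(\varphi(t,h))v \,\mathrm{d}t \not\to 0$$

then the van der Corput estimate A.1 gives that also

$$\frac{1}{S} \int_0^S \frac{1}{T} \int_0^T \langle \pi(\varphi(t+s,h))v, \pi(\varphi(t,h))v \rangle \,\mathrm{d}t \,\mathrm{d}s = \Big\langle \frac{1}{S} \int_0^S \frac{1}{T} \int_0^T \pi(\varphi(t,h)^{-1}\varphi(t+s,h))v \,\mathrm{d}t \,\mathrm{d}s,\ v \Big\rangle \not\to 0$$

as $T \to \infty$ and then $S \to \infty$.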

By the special case of Lemma |4.13| for singleton families we have 

{(t, S, h) ^ Lp(t, + S, h)} ^PET W}, 

and so the inductive hypothesis gives 
rT 



-j 7:{(p{t,h)~^ip{t + s,h))vdt — > Qs 
Jo 



hV as T — > oo 



'0 

with Qs,h the orthoprojection onto Fix(7r((img (p{-, h)~^Lp{- + s, h)))). 
By Corollary 15.51 for every fixed h we have 

Fix(^((img(p(.,/i)-V(- + = Fix(7r((img(^(-,/i)-V(- + -M))) 

for Zariski generic s, and now since (/?(0, h) = e this is equal to 

Fix(7r((img(^(-,/i)))). 

In particular, for every h this equality must hold for Lebesgue-a.e. s, and thus our 
previous average over s may be written instead as 

4 Qs,hvds = 4 Phvds = PhV. 
Jo Jo 

This proves that P^v / 0, as required. □ 
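The differencing step used in Step 2 can be seen in the simplest abelian case. The following worked example is only an illustration under the assumption $G = \mathbb{R}$ (written additively) and $r = 1$, with a hypothetical quadratic map; it is not taken from the original text.

```latex
\[
\varphi(t,h) = h\,t^{2}, \qquad
\varphi(t,h)^{-1}\varphi(t+s,h) = h(t+s)^{2} - h\,t^{2} = 2hs\,t + hs^{2}.
\]
```

The differenced map has degree one in $t$, lower than that of $\varphi$, which is the kind of descent that the PET ordering records. For $h \neq 0$ and $s \neq 0$ its image is all of $\mathbb{R}$, which is also the group generated by $\mathrm{img}\,\varphi(\cdot,h)$, in line with the identification of fixed spaces for Zariski generic $s$ used above.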

8 A partially characteristic factor 

Now fix the following assumptions for this section and the next: 

• $G$ is an $s$-step connected and simply connected nilpotent Lie group;

• $\mathcal{F} = (\varphi_1, \varphi_2, \dots, \varphi_k)$ is a tuple of polynomial maps $\mathbb{R} \times \mathbb{R}^r \to G$ with
$k \geq 2$ in which $\varphi_1$ is a pivot, such that $\varphi_i(0,\cdot) \equiv e$ for each $i$, and such that
$G = \langle \mathrm{img}\,\varphi_1 \cup \cdots \cup \mathrm{img}\,\varphi_k\rangle$ (otherwise we may simply replace $G$ with this
smaller group);

• $(X_i, \Sigma_i, \mu_i, u_i)$ for $0 \leq i \leq k$ is a tuple of $G$-systems, and $\lambda$ is a joining of
them;

• $A^\lambda_T$ for $T \in [0,\infty)$ is the family of averaging operators associated to the orbit
of $\lambda$ under $(\varphi_1, \varphi_2, \dots, \varphi_k)$ as in Theorem 1.3, so note that these implicitly
depend on $h$, the parameter in the argument of the $\varphi_i$ which is not averaged.

At the heart of the inductive proof of Theorem II Al lies a result promising that in or- 
der to study the functional averages A^{fi, f2, . . . , fk), one may assume that one 
of the functions fi has some special additional structure (which we will see later en- 
ables a further reduction to the case of a simpler family of polynomial maps). This 
extra structure is captured by a simple adaptation of an important idea introduced 
in ifTTl . and which has been used extensively since (see, for instance, 11211 l44l l5l[3l'). 

Definition 8.1 (Partially characteristic factor) In the above setting a factor $\Lambda \leq
\Sigma_1$ is partially characteristic for the averages $A^\lambda_T$ if for any tuple of functions
$f_i \in L^\infty(\mu_i)$ one has

\[ \|A^\lambda_T(f_1, f_2, \dots, f_k) - A^\lambda_T(\mathsf{E}(f_1 \mid \Lambda), f_2, \dots, f_k)\|_2 \to 0 \]

as $T \to \infty$ for Zariski generic $h$ (recalling that the operators $A^\lambda_T$ implicitly
depend on $h \in \mathbb{R}^r$).

Remark The main difference between this definition and its predecessors in earlier
papers is that here, in consonance with the statement of Theorem 1.3, we require
convergence only for Zariski generic $h$.

As stated, this definition allows the Zariski meagre set $F \subseteq \mathbb{R}^r$ containing those
$h$ for which convergence fails to depend on $f_1, f_2, \dots, f_k$. However, it is easily
checked that for a given $h$, this convergence holds for all tuples of functions if one
knows that it holds for tuples drawn from some $\|\cdot\|_2$-dense subsets of the unit
balls of $L^\infty(\mu_i)$, $i = 1,2,\dots,k$. Since one can choose countable such subsets,
we deduce that there is a countable intersection of Zariski residual subsets of $\mathbb{R}^r$
(which is therefore still Zariski residual) on which the above convergence holds for
all tuples of functions. ◁
As in many of the earlier works cited above, the first step towards proving the
convergence of $A^\lambda_T(f_1, \dots, f_k)$ will be to identify a partially characteristic factor
with some useful structure. However, a new twist appears in the present setting: 
here we must first pass from G-systems to actions of some covering group of G. 

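To be precise, let

\[ \tilde\varphi_1 : (t,h) \mapsto (\varphi_1(t,h), \dots, \varphi_k(t,h)), \]

let

\[ \tilde\varphi_i : (t,h) \mapsto (\varphi_i(t,h), \dots, \varphi_i(t,h)) \quad \text{for } i = 2,3,\dots,k \]

(notice the subscripts in different coordinates), and let

\[ \tilde G := \langle \mathrm{img}\,\tilde\varphi_1 \cup \mathrm{img}\,\tilde\varphi_2 \cup \cdots \cup \mathrm{img}\,\tilde\varphi_k\rangle \leq G^k. \]

Let $q : \tilde G \to G$ be the restriction to $\tilde G$ of the projection $G^k \to G$ onto the first
coordinate. Then $q$ intertwines each $\tilde\varphi_i$ with $\varphi_i$ for $i \geq 1$ (because $\varphi_i$ appears in
the first coordinate of $\tilde\varphi_i$ for every $i$).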

It is easy to verify that $q(\tilde G) = G$. The group $\tilde G$ is connected, because each
$\tilde\varphi_i(\cdot,h)$ passes through the origin for every $h$, and hence $\tilde G = \exp V$ for some
Lie subalgebra $V \leq \mathfrak{g}^k$. The image of $V$ under the first coordinate projection is
a Lie subalgebra $V_1 \leq \mathfrak{g}$, and since $G$ is simply connected it follows that $\exp V_1$
is a closed subgroup of $G$ which is contained in $q(\tilde G)$. On the other hand it must
contain $\mathrm{img}\,\varphi_i$ for every $i \leq k$, so in fact $q(\tilde G) = \exp V_1 = G$.

The next technical proposition lies at the heart of all that follows. It provides a
partially characteristic factor of $\mathbf{X}_1 = (X_1, \Sigma_1, \mu_1, u_1)$ for the averages $A^\lambda_T$, but
only at the cost of regarding instead the modified system $\mathbf{X}_1^{q(\cdot)}$. The need for this
sleight of hand will become clear during the proof.

Proposition 8.2 Assume that conclusions (1–3) of Theorem 1.3 have already been
established for all polynomial families preceding $\mathcal{F}$ in the PET ordering, suppose
that $\varphi_1$ is a pivot, and let

(Recall the discussion following Definition 6.8.) Then for any systems $\mathbf{X}_i$, $i =
0,1,\dots,k$, the factor $\mathsf{C}\Sigma_1 \leq \Sigma_1$ is partially characteristic.
Remark Of course, once this proposition has been proved then it implies some
conclusion even if $A^\lambda_T(f_1, \dots, f_k) \not\to 0$ for just one value of $h$, because by fixing
that $h$ we may simply regard each $\varphi_i$ as a polynomial function of $t$ alone, and so
apply the proposition with $r = 0$. Indeed, we will use this trick a few times later.
However, one must beware of the delicacy that the idempotent class appearing in
this proposition may not be the same after one makes such a restriction, so nor will
the $\sigma$-algebra $\mathsf{C}\Sigma_1$ in general. Even the group extension $q : \tilde G \to G$ itself
will not be the same as above, but will depend on the choice of $h$. Since at some
points later we will really need the above conclusion about the generic behaviour
of the averages in $h$, it seems easiest to formulate it as here and then apply it with
a restricted parameter space when convenient. ◁

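Proof Since any $f_1$ may be decomposed as

\[ \mathsf{E}_{\mu_1}(f_1 \mid \mathsf{C}\Sigma_1) + (f_1 - \mathsf{E}_{\mu_1}(f_1 \mid \mathsf{C}\Sigma_1)) \]

and the operator $A^\lambda_T$ is multilinear, it is enough to prove that if $\mathsf{E}_{\mu_1}(f_1 \mid \mathsf{C}\Sigma_1) = 0$
then for any $f_2, \dots, f_k$ one has

\[ \|A^\lambda_T(f_1, f_2, \dots, f_k)\|_2 \to 0 \]

as $T \to \infty$ for Zariski generic $h$. Contrapositively, this is equivalent to showing
that if the set

\[ E := \{h \in \mathbb{R}^r : \|A^\lambda_T(f_1, f_2, \dots, f_k)\|_2 \not\to 0 \text{ as } T \to \infty\} \]

is not Zariski meagre then $\mathsf{E}_{\mu_1}(f_1 \mid \mathsf{C}\Sigma_1) \neq 0$. Henceforth we assume that $E$ is not
Zariski meagre.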

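Furthermore, in view of Lemma 6.9 it now suffices to find an extension of spaces
$\pi : (\tilde X, \tilde\Sigma, \tilde\mu) \to (X_1, \Sigma_1, \mu_1)$ and an action $\tilde u : \tilde G \curvearrowright (\tilde X, \tilde\Sigma, \tilde\mu)$ such that

\[ \pi \circ \tilde u^{g} = u_1^{q(g)} \circ \pi \ \text{ for all } g \in \tilde G \quad \text{and} \]

\[ \mathsf{E}(f_1 \circ \pi \mid \Lambda) \neq 0, \]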

where 

k 

i=2 

This is the point at which we have made use of the general properties of idempotent 
classes. This implication will follow in two steps: applying the van der Corput 
estimate (Lemma [A.ll l. and interpreting what it tells us. 

Step 1 Letting

the van der Corput estimate implies that for $h \in E$ one also has

\[ \frac{1}{S}\int_0^S \frac{1}{T}\int_0^T \int_{X_0} g_{t+s,h}\, g_{t,h}\, d\mu_0\, dt\, ds \not\to 0 \]

as $T \to \infty$ and then $S \to \infty$.

For each s, by Lemma ITT] we may re-write the two inner integrals here as 

Jo Jxlx---xXl 

■■■^{fko txr^*'")) ® Uk o d(A 00 A) dt, 

where

\[ \psi_i(t,s,h) := \varphi_i(s,h)^{-1}\varphi_i(t+s,h) \quad \text{for each } i = 1,2,\dots,k, \]

so $\psi_i : \mathbb{R} \times \mathbb{R} \times \mathbb{R}^r \to G$ is a polynomial map with the property that $\psi_i(0,\cdot,\cdot) \equiv e$.

Since A^o A is a joining of two duplicates of each of the G-systems (Xj, T^, fii,Ui) 

for 1 < i < /c, it is invariant under the diagonal transformations u^^^'^^ . Apply- 
ing this within the above integral shows that it is equal to 

JO Jxlx---xxl 

• • • ^ (/. o (/, o ^l^i^'>^m^'^^^)-) d(A 00 A) dt 

with

\[ \psi_i'(t,s,h) := \psi_i(t,s,h)\varphi_1(t,h)^{-1} \ \text{ for } i \geq 1 \quad \text{and} \quad
\varphi_i'(t,h) := \varphi_i(t,h)\varphi_1(t,h)^{-1} \ \text{ for } i \geq 2. \]

We recognize these as comprising the 1st derived family of $\mathcal{F}$, which by Lemma 4.13
precedes $\mathcal{F}$ in the PET ordering because $\varphi_1$ was a pivot. Let

\[ \vec\psi : (t,s,h) \mapsto (e, \psi_1'(t,s,h), \varphi_2'(t,h), \psi_2'(t,s,h), \dots, \varphi_k'(t,h), \psi_k'(t,s,h)). \]

By the inductive hypothesis, for every $h \in \mathbb{R}^r$ there are a Zariski residual set
$F_h \subseteq \mathbb{R}$ and a joining $\theta^h$ on $X_1^2 \times X_2^2 \times \cdots \times X_k^2$ invariant under

such that for all $s \in F_h$ the above integral tends to

/i ® (/i o nr^^'"^) • • • ^ /fc ® (A o ul'^'-''"')) de'^ 

as T — > oo. Moreover these O'^ are equal to one fixed joining on a Zariski 
residual set of h, so that this must in fact be invariant under {G^"^^ U img ifj). 

Since the Zariski residual set has full Lebesgue measure, for each h our previous 
average over s may now be replaced by 



JO JXly.-y.Xl 

implying that for h ^ E this also does not vanish as S — > oo. 
Next, one has

\[ (\varphi_1(s,h), \varphi_1(s,h), \dots, \varphi_k(s,h), \varphi_k(s,h)) \]
\[ = (e, e, \dots, \varphi_k(s,h)\varphi_1(s,h)^{-1}, \varphi_k(s,h)\varphi_1(s,h)^{-1}) \cdot (\varphi_1(s,h), \varphi_1(s,h), \dots, \varphi_1(s,h), \varphi_1(s,h)) \]
\[ = \vec\psi(s,0,h) \cdot (\varphi_1(s,h), \varphi_1(s,h), \dots, \varphi_1(s,h), \varphi_1(s,h)) \]

for every $s$, and so each joining $\theta^h$ is already invariant under the new off-diagonal
polynomial flow

\[ \zeta(\cdot,h) : s \mapsto (\varphi_1(s,h), \varphi_1(s,h), \dots, \varphi_k(s,h), \varphi_k(s,h)). \]

Since we may re-write the above average as

/ / (/i«)l0---«)/fc®l)- ((lO/iO---®lO/fc)o4^''''V^''ds, 

^0 Jx^y---yXl 

by the base case Proposition 17. H it must converge to 

(/i ® 1 ® • • • A 1) • E(i ® /i • • • 1 /fc I de^ 

Xlx-xXl 

as S — > oo, where Sx := Sf ^ ® ■ ■ ■ ® S®^ and the conditional expectation here 
is with respect to 6^. 
Therefore this last integral is nonzero for every $h \in E$. Since the sets
{h: e''^ 6} and {h : '^^^''''^^ / sj;"'^^^ up to 0-neghgible sets} 

both are Zariski meagre (the latter by Corollary 15.5) . their union cannot contain E, 
and so any value h G E that is not in either of these meagre sets witnesses that 

/ (/i 1 • • • /fc 1) • E(i ® /i • • • ® 1 ® /fci si;"'*^^^) de + 0. 

J Xly.---y.Xl 

Step 2 Now set 

i=\ i=\ 

and let $\pi : \tilde X \to X_1$ be the coordinate projection onto the first copy of $X_1$.
Observe that the polynomial map $\zeta$ defined in Step 1 is simply a copy of $\tilde\varphi_1$ in
which each coordinate has been duplicated. Define $q_1 : \tilde G \to G^{2k}$ to be the
restriction to $\tilde G$ of the coordinate-duplicating map

\[ (g_1, g_2, \dots, g_k) \mapsto (g_1, g_1, g_2, g_2, \dots, g_k, g_k). \]

Composing $q_1$ with the Cartesian product action of $G^{2k}$ now gives an action $\tilde u$
of $\tilde G$ on $(\tilde X, \tilde\Sigma, \tilde\mu)$, since we have already deduced from our inductive hypotheses
that $\tilde\mu = \theta$ is invariant under $u_\Delta$ (and hence the image of $q_1 \circ \tilde\varphi_i$ for each $i \geq 2$)
and also under $\langle \mathrm{img}\,\zeta\rangle$ (which is the image of $q_1 \circ \tilde\varphi_1$).

On the first coordinate in Y[i=i ^f, the transformation simply agrees with 
for any g G (img (^2 U • • • U img (fk). On the other hand, 

TT o n^i(*''^) vr o (^xf X X ... X ^^'^^^ x (*''^)) = u^'^''''\ 

Since these cases together generate the whole of $\tilde G$, it follows that $\pi \circ \tilde u^{g} = u_1^{q(g)} \circ \pi$
for all $g \in \tilde G$, where $q : \tilde G \to G$ is the covering homomorphism constructed
previously.

Finally, an inspection of the action u on the other coordinates of X shows that 

• for each i G {2, 3, . . . , /c} the transformations u*^! and agree on 
the first coordinate copy of Xi, and 

• the function E(l(8)/i(8)...(8)l(8)/fc | S^^™^^^) is invariant under the n-action 
of (img(^i). 
Therefore the non-vanishing of the integral at the end of Step 1 asserts that $f_1 \circ \pi$
has a non-zero inner product with a function that is manifestly measurable with
respect to a system in the class

k 

and hence E(/i o vr | CSi) ^ 0, as required. □ 

Remarks 1. The above proof makes clear the need to extend the modified sys- 
tem X^*^ \ rather than Xi itself. We constructed our extension from some joining 
on X • • • X through the coordinate projection onto Xi, and in order to de- 
rive the desired nonzero conditional expectation for it we needed the polynomial 
trajectory of transformations uf^^^'^^ downstairs to lift to the trajectory 

The new map $\tilde\varphi_1$ may not be a PET-minimal member of $(\tilde\varphi_1, \dots, \tilde\varphi_k)$, and it also
may not share its leading term with any of the lifts $\tilde\varphi_i$ for $i \geq 2$, even if $\varphi_1$ downstairs
does have some leading terms in common with the other $\varphi_i$. Thus in order
to write these $\tilde\varphi_i$ as genuine lifts of the $\varphi_i$ we must first split the group $G$ apart
slightly in order to separate these leading terms. Happily, the problem itself gives
us a natural way to do this: the lifted polynomial mapping $\tilde\varphi_1$ is suitably 'separated'
from $\tilde\varphi_i$, $i \geq 2$, inside the Cartesian product $G^k$, so we have simply taken $\tilde G$
to be the closed subgroup of $G^k$ generated by these lifted mappings and composed
our actions with the quotient map $q : \tilde G \to G$.

2. If a factor A < Si is partially characteristic and we assume that the hmits 
a'* = limy j-oo sxist, then considering the integral formula 

/ /o8)/i®---«)AdA^= / fo-A^{h,...,fk)dno 

shows that for Zariski generic h the coordinate projection Xi — )■ Xi is rel- 
atively independent under A'^ over its further factor generated by A < Ei. Thus, 
knowledge of a non-trivial partially characteristic factor gives some structural
information about the limit joining.

In particular, consider a case in which g is an isomorphism (so that the subgroups 

(img pi) and (img 992 U • • • U img pk) are already sufficiently 'spread apart' in G), 
and suppose furthermore that the factor CXi can itself be expressed as a joining 

of systems Zq G cJ,™^'^'^ and G cj"'^^'^"^'"'^^ for i > 2 (rather than just 
as a factor of such). Then we know that any limit joining $\lambda'$ must be relatively
independent over the factor $\mathsf{C}\mathbf{X}_1$, and upon restricting ourselves to this factor we
can express $\lambda'$ alternatively as a joining of

\[ \mathbf{X}_0, \mathbf{Z}_0, \mathbf{Z}_2, \dots, \mathbf{Z}_k, \mathbf{X}_2, \dots, \mathbf{X}_k. \]

(In fact we will use a similar manipulation in the next section). Moreover, the 
assumption that CXi itself be a joining is not terribly restrictive, since an arbitrary 
system Xi always has an extension for which this is true (by using the machinery 
of 'C-sated' extensions, as developed in Chapter 2 of HI). 

It would be interesting to know whether further use of the ideas behind Proposition 8.2
could give a more complete picture of the possible structure of $\lambda'$. This
would presumably involve repeated assertions of relative independence over in- 
creasingly 'small' factors of the original system, on which increasingly large sub- 
groups of G act trivially. Such a picture does emerge in the study of the linear 
multiple averages constructed from a tuple of Z'^-actions (see Chapter 4 of [T]), 
but in the present setting the need to keep track of a large family of different sub- 
groups of G may make the resulting description more obscure. 

Even without a manageable description, this kind of result suggests that the limit
joining $\lambda'$ of Theorem 1.1 not only exists, but exhibits some rigidity over different
possible initial joinings $\lambda$, since $\lambda'$ must exhibit these various instances of relative
independence. Once again there is a superficial analogy here with the study
of unipotent flows on homogeneous spaces, where a central theme is the classification
of all possible invariant measures and the rigidity that such a classification
entails; but once again, I do not know whether this points to any deeper connexions
between that setting and ours. ◁

9 Proof of the main theorem 

We can now complete the proof of Theorem 1.3. The general case is handled by a
'spiral' PET induction on the tuple $(\varphi_1, \dots, \varphi_k)$: for each such tuple we will show
that

\[ (\text{assertions (1,2,3) for } (\varphi_2, \dots, \varphi_k)) \]
\[ \Longrightarrow (\text{assertion (1) for } (\varphi_1, \varphi_2, \dots, \varphi_k)) \]
\[ \Longrightarrow (\text{assertion (2) for } (\varphi_1, \varphi_2, \dots, \varphi_k)) \]
\[ \Longrightarrow (\text{assertion (3) for } (\varphi_1, \varphi_2, \dots, \varphi_k)), \]

at which point the induction closes on itself.

We retain the assumptions from the start of Section 8. Proposition 8.2 gives the
purchase needed to complete our induction. Let the class $\mathsf{C}$ and group extension
$q : \tilde G \to G$ be as in the preceding section. In analysing the family of averages

\[ A^\lambda_T(f_1, f_2, \dots, f_k), \]

Proposition 8.2 allows us to assume that $f_1$ is measurable with respect to the factor
$\mathsf{C}\Sigma_1$, or equivalently that $\mathbf{X}_1$ is itself a system with the property that the $\tilde G$-system
$\mathbf{X}_1^{q(\cdot)}$ is a factor of a member of the class

k 

From this point a careful re-arrangement gives a reduction to the conclusions of
Theorem 1.2 for the group $\tilde G$ and family $(\tilde\varphi_2, \dots, \tilde\varphi_k)$, which is isomorphic to
$(\varphi_2, \dots, \varphi_k)$ and hence precedes $(\varphi_1, \varphi_2, \dots, \varphi_k)$ in the PET ordering (see Lemma 4.13).
Note that this holds in spite of our ascent from $G$ to $\tilde G$, because we have now removed
$\tilde\varphi_1$ from the picture altogether.

In order to set up the necessary re-arrangement, assume that Xi = CXi. By the 
definition of C there are a system X G cf™^'^'^ V Vj=2 d^'^^'^''^^'^ and a factor 
mapvr : X — > X.f\ 

Now let Xi := X and Xj := X^^ for any i / 1, and choose any lift of A to a 
joining A of the Xj (for instance, one could use the relatively independent product 

over A). For each i ^ 1 consider the factor ^ < Si, and let 

Ci '■ ^1 > 

be a factor map of standard Borel G-space which generates this factor. These may 
be realized as factors of the joining A through the coordinate projection fli Xi — y 
Xi. Crucially, by enlarging each of the systems Xj for i 7^ 1, we can arrange that 
under A each of these factor maps to Zi is also virtually measurable with respect 
to the Xj-coordinate, as well as the -coordinate. To this end, for each i ^ I 
consider the composition 

coord, proj. ~ ~ CiXid ^ 

XqX ■■■ X Xk Xix Xi Zi X Xi. 

Since this composition respects the G-actions, it defines a joining of Zj with Xj, 
which we denote by Yj = (Yj, <I>j, fj, Uj). Let rji : Yi — > Xi be the second 
coordinate projection. 
Thus we have constructed a collection of factorizations



{Xq X • • • X Xfc,So (8) • • • (g) tk,X,UA) coor .proj. ^ {Xi,ti, ili,Ui) 




{Yi,^i,Ui,Vi) 



for each i G {0, 2, 3, . . . , k}. Putting these together with the coordinate projection 
Xq X ■ ■ ■ X Xk — > ^1 therefore gives a measure-theoretic isomorphism 

(Xo X • • • X Xfe, So • • • ® Sfe, A, ua) 

^ (Fo X Xi X y2 X ■ ■ ■ X Ffc, $0 ® Si (g) #2 (8> • ■ ■ 6*, va) 
for some joining 6 of G-systems. 

In addition, this construction guarantees that the factor maps 
XoX-.-xX, X,^Z, 

and 

~ ~ coord, proj. _ 

XoX---xXk — >Yi ^ ' Zi 

agree up to A'-negligible sets. Therefore any h G L°°{jli) which is measurable 

with respect to 5]^™^*^^'^' ^ (equivalently, which is invariant under (img 
with the convention that (po = e) has an essentially unique counterpart h' G 
L°°{vi) which lifts to the same function on Xq x ■ ■ ■ x X^ up to A-negligible 
sets, and which is invariant under the same subgroup of G. 

Lemma 9.1 In the situation described above, consider the averaging operators 
associated to the lifted family of polynomial maps (pi ■.'R xW — > G. Suppose 
that fi G L°°(fli) is a function of the special form 

g-h2 hk, 

where g G L°°{jli) is invariant under (imgc^i) and each hi is invariant under 
(img Then for any other functions fi G L°°{p,i)for i ^ 1 one has 

Akh, /2, • • • , fk) = E(5' • A't{1, h'^{f2 o 7?2), . . . , h'kifk o %)) I m) 

(recalling that has range in L°°{iiq), while has range in L°°{i'q)), where 
g' and h[ are the counterparts of g and hi introduced above. 
Proof By the definition of and this follows from the analogous calcu-
lation at the level of joinings. For the joinings A and 9, the above isomorphism 
gives 

^ / /o ^ (/i o nf(*'")) ® • • • (/fc o dXdt 
^ I (/oo7?o)^(/ionf 

JYoxXixY2X---xYk 

0(/2 o o (*''^)) ^ . . . (/fc o r?, o (*''^)) dO dt. 

On the other hand, our assumptions on the structure of /i imply that 

<^onf =5 and h, o uf'^'^''^ = hi o uf^'^''\ i = 2,3, ... ,k, 

for all {t,h). Also, the counterparts 5' G L°°(i'o) and /i^ G L°°{vi) for i > 2 
satisfy 

5' (2/0) = 5(^1) and h'-{yi) = hi{xi) 

for 0-almost every {i/q, xi,y2, ... ,yk)- The above integral with respect to 9 may 
therefore be re-written as 

-f^ [ (5'(/0O??0))®lx,®((/i2(/2°r?2))ot;f^(*'"V 
JO JYoxXixY2X---xY^ 

■■■®{{K{fkor,u))ovl''^'^^^)d9dt. 
Regarded as a linear functional applied to /o, this is integration against 

E(5-^f.(l, /l2(/2 0??2), hk{fkOVk))\m), 

as required. □ 

Of course, the importance of the above lemma is that on the right-hand side there is 
no non-trivial function in the first entry under A^. This now leads quite smoothly 
to a completion of our spiral induction. 

Proof of Theorem 1.3 In case $k = 1$, $A^\lambda_T$ extends to a bounded operator $L^2(\mu_1) \to
L^2(\mu_0)$ and the desired assertions of convergence and genericity become simply
that (i) the average

converges to $M^\lambda P_h f$ with $P_h$ the conditional expectation onto $\Sigma_1^{\langle \mathrm{img}\,\varphi(\cdot,h)\rangle}$, and
(ii) this is generically equal to $M^\lambda P f$ with $P$ the conditional expectation onto
$\Sigma_1^{\langle \mathrm{img}\,\varphi\rangle}$. Both these assertions follow at once from Proposition 7.1.

It remains to handle the inductive step in case $k \geq 2$. Assume that properties (1–3)
have already been proved for all tuples preceding $\mathcal{F}$ in the PET ordering. We will
deduce those properties for $\mathcal{F}$ in order.

Property (1) In this step, by fixing one $h$ throughout the proof and replacing $G$
with its subgroup $\langle \mathrm{img}\,\varphi_1(\cdot,h) \cup \cdots \cup \mathrm{img}\,\varphi_k(\cdot,h)\rangle$ if necessary, we may assume
that each $\varphi_i$ is a function of $t$ alone, and hence that $r = 0$. With this agreed, let
$q : \tilde G \to G$ and the class $\mathsf{C}$ be constructed as before using this new group and
tuple of maps.

By re-ordering $\mathcal{F}$ if necessary we may also assume $\varphi_1$ is a pivot. In this case, by
Proposition 8.2 it suffices to show that the averages $A^\lambda_T(f_1, \dots, f_k)$ converge when
$f_1$ is $\mathsf{C}\Sigma_1$-measurable.

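Construct the $\tilde G$-systems $\tilde{\mathbf{X}}_i$ and $\mathbf{Y}_i$ as above. Lifting $f_1$ to $f_1 \circ \pi \in L^\infty(\tilde\mu_1)$, on
this larger system we know that it can be approximated in $L^2(\tilde\mu_1)$ by finite sums of
the form

\[ \sum_p g_p \cdot h_{2,p} \cdots h_{k,p}, \]

where $g_p \in L^\infty(\tilde\mu_1)$ is invariant under $\langle \mathrm{img}\,\tilde\varphi_1\rangle$ and each $h_{i,p} \in L^\infty(\tilde\mu_1)$ is invariant
under $\langle \mathrm{img}\,\tilde\varphi_i\tilde\varphi_1^{-1}\rangle$.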

Appealing first to the uniform continuity of the operators Aj, in each entry sep- 
arately, and then to the linearity of these operators in the first entry, it therefore 
suffices to prove convergence of the averages 

^t(/i' • • • 1 /fc) 

whenever /i is one such product function. However, this case lands within the 
hypothesis of the preceding lemma, which converts these into averages of the form 

E{g ■ /l2(/2 o?/2), • • • , hkifkOVk)) I m)- 

The norm convergence of these now follows from the norm convergence of the 
averages A^{1, /i2(/2 ° ?/2), • • • , hk{fk°'nk)), which is promised by the inductive 
hypothesis applied to the simpler polynomial family {ip2: ■ ■ ■ ,Vk)- 

Property (2) Of course, property (1) already implies convergence of the averaged
couplings
Jo 

as T — y oo to some limit A''. We must next show that for any tuple of functions 
fi G {fJ,i), the A'^-integrals are the same whether we integrate fo® fi^ ■ ■ ■ "S) fk 

or (/o o liSo) (g) (/^ o tiSi) (g) . . . (g) (j^ o tiSfe) for any 

{9o,9i,---,9k) eG^^''+^^ or (50,51, ••• ,5fc) S (img ^ (•, /i)). 

This will give the invariance of A'* under the Ux action of U img 

{■,h)). 

As in the case of property (1), in this step we can fix a choice of h and replace G 
with the subgroup := (img ipi{-,h) U ■ ■ ■ U img ipk{-, h)) if necessary, so that 
we may assume r = 0. 

Since 

E(/ilCSi)o^xf = E(/ionf |CSi) 

for any $g$, by Proposition 8.2 it again suffices to treat the case when $f_1$ is $(\mathsf{C}\Sigma_1)$-measurable.
Now we may consider again the previous construction of the $\tilde G$-systems
$\tilde{\mathbf{X}}_i$ and $\mathbf{Y}_i$ and their joinings $\tilde\lambda$ and $\theta$. In these terms we wish to prove
that

/ /o ® /i ® • • • O /fc dA' 

= I (/oong°)C55(/iottf )C55---®(/feonf )dA' 
for any tuple fi € L°°{jii) and any 

(50,51, ••• e G^^*"^^^ or (^0,51, ••• ,5fc) e (img (•))> 
where A' is the limit joining obtained by averaging A. 

Arguing again as for property (1), by continuity and multilinearity we may now 

assume that /i is of the special form 5 • /12 assumed by Lemma [OTT] and 

so by that lemma it now suffices to prove that 

I {g'ifo o %)) ® 1 ■ ■ ■ ® (/ifc(A o m)) de' 

JYoxXoxYzx-xYk 

= I ((5'(/oor?o))ot;^»)0l0...0((/i',(/,o%))ot;f)d0', 
JYoxXoxY2X---xYk 
where $\theta'$ is the limit joining obtained by averaging $\theta$. With this re-arrangement the
coordinate in $\tilde X_1$ vanishes from the picture, and what remains is just an instance
of property (2) for the simpler tuple of polynomial maps $(\tilde\varphi_2, \dots, \tilde\varphi_k)$, which is
known by induction.

Property (3) Lastly, we must show that there is a Zariski residual set $E \subseteq \mathbb{R}^r$
such that for any tuple of functions $f_i$ the limit

\[ \lim_{T \to \infty} \int_{X_0} f_0 \cdot A^\lambda_T(f_1, \dots, f_k)\,d\mu_0 \]

is the same for all $h \in E$, which will imply that the map $h \mapsto \lambda^h$ is Zariski
generically constant (and hence, by property (2), that this generic value must be
invariant under the whole of $\langle G^{\Delta(k+1)} \cup \mathrm{img}\,(e, \varphi_1, \dots, \varphi_k)\rangle$). In this step, of course, we may
not restrict to a single value of $h$.

Clearly it suffices to prove this $h$-independence for functions $f_i$ drawn from countable
$\|\cdot\|_2$-dense subsets of $L^\infty(\mu_i)$, and since a countable intersection of Zariski
generic sets is Zariski generic we may therefore look for such a Zariski generic set
for just a single tuple of functions $f_i$.

The full strength of Proposition I8.2l and our construction above now give a Zariski 
residual subset E C W , extensions of G-systems vr : — > '^'f ^ ^rid a joining 
A of G-systems such that 

J X{)X---xXi, 

= lim / (/o o vTo) • o TT I A),/2 o7r2,. . . ,/fc O7rfc)d/io 

T — >oo Jx^^ 

for all h £ E, where now 

k 

1=2 

Clearly it suffices to show that the desired /i-independence holds on some further 
Zariski residual subset of E, and now the same manipulations as above give a 
reduction of this to a proof that the limits 

^lim / (5o(/oo?/o)) -^7^(1, ^2(/2 0772), •••,/ife(/feO %))di^o 

are independent of h on some Zariski residual set, where 9 and the Yi have been 
constructed from $\tilde\lambda$ and the $\tilde{\mathbf{X}}_i$ as previously. The dependence on $h$ in this expression
is all in the off-diagonal polynomial trajectory that appears in the averaging
operator for $\theta$. Once again, the fact that this limit is generically constant now follows from
the inductive hypothesis applied to the family $(\tilde\varphi_2, \dots, \tilde\varphi_k)$, and so the proof is
complete. □



10 Further questions 

10.1 Other questions in continuous time 

Theorems 1.1 and 1.2 suggest many possible extensions involving different kinds
of averaging, just as for any other equidistribution phenomenon. The following
paragraphs contain a sample of these possibilities.

First, given another connected nilpotent group $G'$, one could ask more generally
about polynomial maps $\varphi_i : G' \to G$ and the resulting off-diagonal averages
along a Følner sequence of subsets $F_N \subseteq G'$. Do these always converge as in our
main theorems? This seems likely, and I suspect that the methods of proof above
can provide significant insight into this question, but it may be tricky to set up the
right generalization of PET induction.

A little more abstractly, the off-diagonal polynomial trajectory

\[ \{(\varphi_1(t), \varphi_2(t), \dots, \varphi_k(t)) : t \in \mathbb{R}\} \]

is a semi-algebraic subset of $G^k$ in the sense of real algebraic geometry (see, for
instance, Bochnak, Coste and Roy lITTI ). Could it be that convergence as in Theorems
1.1 or 1.2 holds along the intersections of increasingly large balls with any
semi-algebraic subset $V \subseteq G^k$, endowed with a suitable surface-area measure?

A more challenging question concerns the assumption that $G$ be nilpotent. Do
Theorems 1.1 or 1.2 still hold if we assume only that $G$ is an arbitrary connected
and simply connected Lie group? This is probably too much to ask, but some
progress may be possible, for instance, if each $\varphi_i$ has image lying within a unipotent
subgroup of $G$. This seems a natural setting to investigate in view of Ratner's
Theorems giving equidistribution and measure rigidity for unipotent flows
on homogeneous spaces [34, 33, 35, 36], and Shah's extension of these results to
averages over regular algebraic maps [37].

However, as remarked in the Introduction, the methods used to study homogeneous
space flows are very different (and mostly much more delicate) from those
explored in this paper. Shah's analysis of regular algebraic maps proceeds by first
obtaining the invariance of a weak limit measure under some unipotent subgroup
and then using the resulting structure promised by Ratner's Theorems, whereas it
is an essential feature of our inductive proof of Theorem 1.3 that the cases of homomorphisms
$\varphi_i$ and of more general polynomial maps must be treated together.

To illustrate more concretely some of the difficulties posed by non-nilpotent groups, 
consider the functional averages 

Jo 

for a jointly measurable probability-preserving system $(X, \Sigma, \mu, u)$ for $G = \mathrm{SL}_2(\mathbb{R})$
and with $u_1, u_2 : \mathbb{R} \to \mathrm{SL}_2(\mathbb{R})$ parametrizing the upper- and lower-triangular
subgroups respectively. (These averages are easily expressed in terms of the natural
analog of Theorem 1.2.) If we assume that these averages do not tend to $0$
for some choice of $f_1, f_2 \in L^\infty(\mu)$, then the van der Corput estimate and a re-arrangement
give also




/i • (/i o uf) ■ ((/2 • (/2 o n|)) o uiuf) dfidtds ^ 



as T — > 00 and then S — > 00. In order to use this, we need some informa- 
tion about the averages along the trajectory t 1-^ u\u'^^ in G. This is certainly a 
polynomial map in the sense of real algebraic geometry, but not in the sense of
Definition 4.1, so further differencing does not seem to lead to a simplification of
the problem. I have not examined in detail what other arguments (for example,
using the representation theory of $\mathrm{SL}_2(\mathbb{R})$) might be brought to bear here, since
this is only a very special case: it simply serves to illustrate that the method of PET
induction cannot be applied so naively in this setting.
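As background to this obstruction, the following sketch checks a standard fact about the Lie algebra of $\mathrm{SL}_2(\mathbb{R})$: iterated brackets of the generators of the upper and lower unipotent subgroups never become trivial, in contrast with the $s$-step nilpotent setting where they vanish after finitely many steps. The matrices `e`, `f` and the helper names are assumptions introduced for this illustration; the connection to differencing is only heuristic and is not a claim made in the text above.

```python
def matmul(a, b):
    """Product of two 2x2 integer matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def bracket(a, b):
    """Lie bracket [a, b] = ab - ba."""
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(2)] for i in range(2)]

e = [[0, 1], [0, 0]]  # generator of the upper unipotent subgroup
f = [[0, 0], [1, 0]]  # generator of the lower unipotent subgroup
h = bracket(e, f)     # equals diag(1, -1)

x = e
for depth in range(1, 6):
    x = bracket(h, x)  # each bracket with h doubles the multiple of e
    print(depth, x)
```

Each further bracket with `h` returns a nonzero multiple of `e`, so the lower central series of the Lie algebra never terminates.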

Finally, linked to the study of convergence and equidistribution is the problem of
describing the limit joinings $\lambda'$. Some information on their possible structure is
contained in the proof of Proposition 8.2 above, as remarked after that proposition,
but it would be interesting to know whether they can be classified more precisely,
possibly after extending each $\mathbf{X}_i$ to a suitably-sated extension. A discussion of
related issues in the setting of $\mathbb{Z}^d$-actions can be found in UJ.

10.2 Discrete actions 

Most past interest in the kind of off-diagonal average appearing in Theorem 1.1 has
focused on actions of discrete groups. Suppose that $\Gamma$ is a discrete nilpotent group,
$\varphi_1, \varphi_2, \dots, \varphi_k : \mathbb{Z} \to \Gamma$ are polynomial maps (according to the obvious relative
of Definition 4.1), $\mathbf{X}_i = (X_i, \Sigma_i, \mu_i, T_i)$ are probability-preserving $\Gamma$-systems for
$1 \leq i \leq k$ and $\lambda$ is a joining of the systems $\mathbf{X}_i$. Much recent work has been
directed towards understanding whether the off-diagonal averages

1 ^ 

n=l 

converge to some limit joining as $N \to \infty$, or whether the associated functional
averages converge. Several partial results have appeared, and at the time of this
writing Miguel Walsh has just settled the general case in his preprint [43].

Walsh's approach does not use heavy ergodic-theoretic machinery. It relies on 
reformulating the problem of norm convergence for the functional averages into a 
problem asking for some 'quantitative' guarantee that one can find long intervals 
of times N in which those averages are all close in || • ||2. This new assertion can 
then be proved by a clever induction on the tuple of polynomial maps {(pi , . . . ,(pk), 
which is apparently different from Bergelson's PET induction. 

In making this reformulation, Walsh uses ideas that have some precedent in Tao's 
proof of convergence when $\Gamma = \mathbb{Z}^d$ and all the $\varphi_i$ are linear ([40]). Some of
these ideas lie outside more traditional ergodic-theoretic approaches to this class of 
questions (such as the present paper), and they have the consequence that very little 
can be gleaned about the structure of the limits (functions or joinings). Therefore 
it would still be of interest to see a proof that gives some additional information, 
similar to our Theorem 1.3 or to the earlier, even more precise results of lISTl or [44]
in the case of discrete powers of a single transformation. We finish with an informal 
discussion of the difficulties that face any attempt to adapt the arguments of the 
preceding sections to the setting of discrete $\Gamma$.

The first and most obvious difficulty is that if these averaged couplings do converge
to some limit $\lambda'$, it need not be invariant under the off-diagonal subgroup

$$\langle \mathrm{img}(\varphi_1, \ldots, \varphi_k) \rangle \le \Gamma^k.$$

Indeed, let $\Gamma = \mathbb{Z}$, let $\varphi_1 = 0$ and $\varphi_2(n) := n^2$, and let $\mathbf{X}_1 = \mathbf{X}_2$ be the system
given by the generator rotation $T$ on $\mathbb{Z}/4\mathbb{Z}$. Since all square numbers are congruent
to either 0 or 1 mod 4, it is easily computed that the limit obtained by averaging
the diagonal joining $\lambda$ is simply

$$\frac{1}{2}\lambda + \frac{1}{2}(\mathrm{id} \times T)_*\lambda,$$

which is not $(\mathrm{id} \times T)$-invariant.

Of course, this is a trivial example, but it is not clear whether this kind of arithmetic
system, appearing as a factor of more general systems $\mathbf{X}_i$, is the only possible
obstruction to the desired extra invariance of the limit joining.
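
The computation in the example above can be checked mechanically. The following is a minimal sketch, not part of the argument: it represents measures on $\mathbb{Z}/4\mathbb{Z} \times \mathbb{Z}/4\mathbb{Z}$ as dictionaries of exact rational weights, and the names `averaged_joining` and `push_id_times_T` are hypothetical helpers introduced here purely for illustration. It assumes the averages take the form $\frac{1}{N}\sum_{n=1}^{N} (\mathrm{id} \times T^{n^2})_*\lambda$ with $\lambda$ the diagonal joining.

```python
from fractions import Fraction

M = 4  # the rotation T(x) = x + 1 acts on Z/4Z


def averaged_joining(N):
    """Return (1/N) * sum_{n=1}^{N} (id x T^{n^2})_* lambda as a dict of weights,
    where lambda is the diagonal joining on Z/4Z x Z/4Z (illustrative helper)."""
    weights = {}
    for n in range(1, N + 1):
        shift = (n * n) % M  # T^{n^2} is rotation by n^2 mod 4
        for x in range(M):
            point = (x, (x + shift) % M)
            weights[point] = weights.get(point, Fraction(0)) + Fraction(1, N * M)
    return weights


def push_id_times_T(measure):
    """Push a measure on Z/4Z x Z/4Z forward under id x T (illustrative helper)."""
    return {(x, (y + 1) % M): w for (x, y), w in measure.items()}


# The claimed limit: (1/2) lambda + (1/2) (id x T)_* lambda.
diagonal = {(x, x): Fraction(1, M) for x in range(M)}
expected = {}
for source in (diagonal, push_id_times_T(diagonal)):
    for point, w in source.items():
        expected[point] = expected.get(point, Fraction(0)) + Fraction(1, 2) * w

average = averaged_joining(400)  # N even, so odd and even n occur equally often

print(average == expected)                  # True
print(push_id_times_T(average) == average)  # False: not (id x T)-invariant
```

For even $N$ the average already equals the limit exactly, because odd and even values of $n$ occur equally often; pushing it forward under $\mathrm{id} \times T$ moves mass onto the pairs $(x, x+2)$, which the limit does not charge.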

While this example bears only on the possible symmetries of the limit joining, in 
the continuous-time setting those symmetries play a crucial role in the proof of 
Proposition 8.2 above, and so the whole method of proof we have used in this
paper may need substantial modification before it can give convergence results in 
the discrete-time world. 

A second difficulty worth remarking is the absence of any useful replacement for
the notion of Zariski genericity in the discrete-time setting. Of course, Corollary 5.5
is still true for discrete group actions: the problem is that it tells us nothing,
because these groups are themselves countable.

It might be worth exploring a more subtle appeal to the reasoning of Corollary 5.3
in place of Corollary 5.5. The statement of Corollary 5.3 is also still true for discrete
groups provided the subgroups $H_1$ and $H_2$ are both normal in $\langle H_1 \cup H_2 \rangle$.
One possibility might begin as follows. If $\mathfrak{H}_1, \mathfrak{H}_2, \ldots$ is a sequence of closed
subspaces of a Hilbert space $\mathfrak{H}$, any two of which are relatively orthogonal over some
common further subspace, and if in addition $x \in \mathfrak{H}$ is such that $\inf_n \|P_n x\| > 0$
with $P_n$ the orthoprojection onto $\mathfrak{H}_n$, then $x$ also has a nonzero projection onto
that common subspace (for otherwise the $P_n x$ would be an infinite sequence of mutually
orthogonal projections of a single vector, all of them large, contradicting Bessel's Inequality).
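
For the reader's convenience, the Bessel argument in the parenthesis above can be written out as follows. This is only a sketch under assumed notation that does not appear in the text: $\mathfrak{K}$ denotes the common subspace, $Q$ the orthoprojection onto it, $\delta := \inf_n \|P_n x\|$, and 'relatively orthogonal over $\mathfrak{K}$' is read as $\mathfrak{H}_n \ominus \mathfrak{K} \perp \mathfrak{H}_m \ominus \mathfrak{K}$ whenever $n \neq m$.

```latex
% Assumed notation (not from the text): \mathfrak{K} is the common subspace,
% Q the orthoprojection onto \mathfrak{K}, \delta := \inf_n \|P_n x\| > 0,
% and e_n := P_n x / \|P_n x\|.
% Since \mathfrak{K} \subseteq \mathfrak{H}_n, we have Q P_n = Q for every n.
\begin{align*}
Qx = 0
  &\ \Longrightarrow\ Q P_n x = Qx = 0
   \ \Longrightarrow\ P_n x \in \mathfrak{H}_n \ominus \mathfrak{K}
   \ \Longrightarrow\ \langle P_n x, P_m x \rangle = 0 \quad (n \neq m), \\
\langle x, e_n \rangle
  &= \frac{\langle x, P_n x \rangle}{\|P_n x\|} = \|P_n x\| \ \geq\ \delta, \\
\|x\|^2
  &\ \geq\ \sum_{n=1}^{N} |\langle x, e_n \rangle|^2 \ \geq\ N \delta^2
   \quad \text{for every } N \geq 1 .
\end{align*}
% The last line fails once N > \|x\|^2 / \delta^2, so Qx \neq 0.
```

The middle line uses only that $P_n$ is an orthoprojection, so $\langle x, P_n x \rangle = \|P_n x\|^2$; the final inequality is Bessel's Inequality applied to the orthonormal family $(e_n)_n$.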

Structure like this has previously been identified within orthogonal representations
of a finitely generated nilpotent group by Leibman [24]. Using this reasoning, for
example, one can show that if

$$\Gamma = \langle a, b \mid [a, b] =: c \text{ is central} \rangle$$

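is the discrete Heisenberg group and $T : \Gamma \curvearrowright (X, \Sigma, \mu)$ is any action of it, then
the $\sigma$-subalgebras

$$\Sigma^{\langle a \rangle} := \{A \in \Sigma : \mu(T^a A \triangle A) = 0\}$$

and $\Sigma^{\langle b \rangle}$ are relatively independent over the fully invariant factor $\Sigma^{\Gamma}$, even though
in this discrete setting it can happen that $\Sigma^{\langle a \rangle} \neq \Sigma^{\langle a, c \rangle}$ and $\Sigma^{\langle a \rangle}$ is not globally
$\Gamma$-invariant. This follows because a judicious appeal to the discrete version of
Corollary 5.3 implies that the $\sigma$-algebras

$$\Sigma^{\langle a c^n \rangle}, \quad n \in \mathbb{Z},$$

are all relatively independent over $\Sigma^{\langle a, c \rangle}$, where $\langle a, c \rangle$ is normal in $\Gamma$. If now $f$ and
$g$ are $T^a$- and $T^b$-invariant respectively, then applying $T^{b^n}$ gives

$$\int f \cdot \mathsf{E}(g \,|\, \Sigma^{\langle a \rangle}) \, d\mu = \int \big(f \cdot \mathsf{E}(g \,|\, \Sigma^{\langle a \rangle})\big) \circ T^{b^n} \, d\mu = \int (f \circ T^{b^n}) \cdot \mathsf{E}(g \,|\, \Sigma^{\langle a c^n \rangle}) \, d\mu.$$

Therefore the non-vanishing of this integral implies that $g$ actually has uniformly
nonzero conditional expectation onto every $\Sigma^{\langle a c^n \rangle}$. Hence by the argument
sketched above, it must actually have nonzero conditional expectation onto $\Sigma^{\langle a, c \rangle}$,
and similarly $f$ must have nonzero conditional expectation onto $\Sigma^{\langle b, c \rangle}$. These two
$\sigma$-algebras are now globally $\Gamma$-invariant and relatively independent over $\Sigma^{\Gamma}$, so
putting this together shows that $\Sigma^{\langle a \rangle}$ and $\Sigma^{\langle b \rangle}$ are themselves relatively
independent over $\Sigma^{\Gamma}$.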

In order to use a similar idea to study off-diagonal or multiple averages, one might, 
for instance, try to prove a discrete analog of Proposition 8.2 according to which
the characteristic factors obtained depending on h are not mostly equal to each 
other, but are all relatively orthogonal over some common smaller $\sigma$-algebra A'.
Then it might be possible to replace A'' with A' in subsequent arguments and gain 
more purchase on the asymptotic behaviour of our averages as a result. However, I 
do not have a precise statement to formulate based on this speculation. 

A A continuous-time van der Corput estimate



We recall here for completeness a continuous-time variant of the classical van der
Corput estimate for bounded Hilbert-space-valued sequences. The discrete-time
version can be found in Section 1 of IITtI, and a continuous-time version in
Appendix B of Potts EH.

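Lemma A.1 If $u : [0, \infty) \to \mathfrak{H}$ is a bounded strongly measurable map into a
Hilbert space, then vector-valued non-convergence

$$\frac{1}{T} \int_0^T u(t) \, dt \not\to 0 \quad \text{as } T \to \infty$$

implies the scalar-valued non-convergence

$$\frac{1}{S} \int_0^S \frac{1}{T} \int_0^T \langle u(t + s), u(t) \rangle \, dt \, ds \not\to 0 \quad \text{as } T \to \infty \text{ and then } S \to \infty.$$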